FOREWORD


The Army Mathematics Steering Committee (AMSC) annually sponsors the
conference entitled the Design of Experiments in Army Research, Development
and Testing. The thirty-second in this series was hosted by the US Army
Combat Development Experiment Center (USACDEC) and was held 29-30
October 1986 at the Hilton Inn Resort, Monterey, California. Dr. Marion R.
Bryson, Director of USACDEC, served as the local host and conference
coordinator not only for this conference but also for the twenty-third and
twenty-eighth Design of Experiments meetings. The members of the AMSC
appreciate his efforts and the efforts of his staff in coordinating the many
details needed to conduct all three of these symposia.

The Special Session this year was entitled "Field Experimentation: The
Analysis of Messy Data." Three invited papers were presented. The
papers of Professors Dallas E. Johnson and John Tukey discussed the analysis
of messy data, while the jointly authored paper by Drs. Marion R. Bryson and
Carl T. Russell presented some of the problems of scoring casualties in field
trials. The titles of the technical and clinical sessions give some idea of
the many statistical areas treated in the contributed papers: (1) Parametric
Statistics, (2) Statistical Theory, (3) Design of Experiments, (4) Data
Analysis and Modeling, (5) Theory and Probabilistic Inference, (6) Fuzzy
Statistics, (7) Forecasting and Prediction, (8) Small Sample Analysis, and (9)
Regression and Smoothing. For the invited speaker phase of the conference,
the Program Committee obtained the following nationally known scientists to
talk on topics of current interest to Army personnel as well as other
attendees.


Speaker and Affiliation              Title of Address

Professor George E.P. Box            Statistical Design, Analysis for
University of Wisconsin              Quality Improvement

Professor Walter T. Federer          Statistical Analysis for
Cornell University                   Intercropping Experiments

Professor Persi Diaconis             The Search for Randomness
Stanford University

Professor Emanuel Parzen             Quantile Statistical Data
Texas A&M University                 Analysis

Professor Stuart Geman               Some Applications of Bayesian
Brown University                     Image Analysis


The conference was preceded by a two-day tutorial on "Density Estimation,
Modeling and Simulation" by Professor James Thompson of Rice University. The
dates for the tutorial were Monday and Tuesday, 27-28 October 1986. Professor
Thompson has conducted extensive research in the areas covered in his
lectures. His approach to the presented material was excellent and he
generated many interesting discussions.

Dr. Francis G. Dressel was the recipient of the Sixth Wilks Award for
contributions to Statistical Methodologies in Army Research, Development and
Testing. Dr. Dressel was uniquely qualified by virtue of his service in the
Mathematical Sciences Division of the U.S. Army Research Office over three
decades. He was one of the principals at the inception of the Army Design of
Experiments Conference and, along with Sam Wilks, planned and implemented the
then-fledgling conference. Dr. Dressel currently serves as editor of the
Conference Proceedings and continues to contribute to the advancement of
statistics in the U.S. Army.

The AMSC would like to thank the members of the conference committee for
guiding this excellent scientific conference, and also to thank the
Mathematical Sciences Division of the Army Research Office for preparing the
proceedings of these meetings.
Statistical Analyses for Intercropping Experiments


Walter  T.  Federer 
Cornell  University 
Ithaca,  N.Y.  14853 


Abstract 

Statistical methodology for analyzing intercropping experiments has been
developed over the last 20 years and is still being developed. Considerably
more research is required for the many and diverse types of experiments
involving sole crops (crops grown alone) and mixtures of crops
(intercrops) grown together or in sequence. The growing of two or more
crops together or in sequence is known as intercropping. An outline of
twenty chapters of a book on the statistical design and analysis of
intercropping experiments is presented. A number of the statistical analyses
in the book are briefly described. Sections 2 to 8 relate to analyses for two
crops in a mixture along with sole crops. Sections 9 to 15 discuss
analyses for three or more crops in a mixture in addition to sole crops and
mixtures of two crops. It is stressed that it is dangerous to extrapolate
from sole crop responses to mixtures of two crops and from mixtures of k
crops to mixtures of k + 1 crops. Many of the data sets examined produced
unexpected and sometimes surprising results. The last section discusses
other areas of application, e.g., survey sampling, nutrition, education,
medicine, and recreation, where these results can be utilized.
1. Introduction

Intercropping involves the growing of two or more crops
on the same area of land either simultaneously, partially at the same time,
or sequentially. It is a centuries-old practice in tropical agriculture,
and to some extent in temperate zone agriculture. Agricultural, biological,
and statistical investigations have tended to ignore the problems of
research in this area. Statistical analysis of intercropping investigations
is considered to be the most important unsolved statistical question
related to research in tropical agriculture. It is an area neglected by
all except a handful of statisticians. A computer search of statistical
literature resulted in a single paper citation, for Mead and Riley (1981).
This is an excellent paper, though limited in outlook for the broad range
of statistical analyses useful in intercropping research.
To acquaint the statistical profession with relevant procedures and to
fill a need of intercropping researchers, a book is being published by this
author on the topic. The table of contents is:


Part I - Two Crops

Chapter 1.  Introduction
Chapter 2.  One main crop grown with a supplementary crop
Chapter 3.  Both crops main crops - density constant - analyses for
            each crop separately
Chapter 4.  Both crops main crops - density constant - combined crop
            responses
Chapter 5.  Both crops of major interest with varying densities
Chapter 6.  Monocultures and their pairwise combinations when
            responses are available for each member of the combination
Chapter 7.  Monocultures and their pairwise combinations when
            separate crop responses are not available
Chapter 8.  Spatial and density arrangements
Chapter 9.  Some variations for intercropping

Part II - Three or More Crops

Chapter 10. Introduction
Chapter 11. One main crop with more than one supplementary crop
Chapter 12. Three or more main crops - density constant
Chapter 13. Three or more main crops - density variable
Chapter 14. Monocultures and their combinations when responses are
            available for each crop
Chapter 15. Monocultures and their combinations when separate crop
            responses are not available
Chapter 16. Spatial and density arrangements for three or more crops
Chapter 17. Variations for intercropping of three or more crops

Part III - Additional Topics

Chapter 18. Experiment design for intercropping experiments
Chapter 19. Other areas of application
Chapter 20. Bibliography on intercropping investigations


It is necessary to fully comprehend the nature of two crop mixtures
before proceeding to anything more difficult. The interpretational
difficulty increases by an order of magnitude when going from sole crop
(crops grown alone) experiments to experiments with sole crops and biblends
(mixtures of two crops). It goes up another order of magnitude in going
from intercropping experiments with two crops to experiments involving
mixtures of three or more crops. In addition to the interpretational
difficulty, it is dangerous to extrapolate from sole crops to biblends and
from biblends to mixtures involving three or more crops. It is dangerous
to extrapolate from lower densities to higher ones. Many, if not most,
experiments contain an unexpected result.


A number of statistical analyses found useful for intercropping
investigations are discussed below. The topics follow the table of contents
of the forthcoming book outlined above.


2. One Main Crop Plus One Supplementary Crop


The experiment designs found useful for sole crops will be the same
ones found useful for one main crop grown with a supplementary crop. The
treatment design consists of the varieties of a main crop grown as sole
crops and in combination with varieties of the supplementary crop. To
illustrate, suppose that $c_m = 5$ varieties of maize are to be grown
alone and in combination with $c_b = 6$ varieties of beans. A single
density for maize and for beans is selected, i.e., plant population per
hectare is not a variable. The treatment design would be:

[Table: cropping system by maize variety (rows 1 to 5); the columns are the
sole crop and the biblends with each bean variety (1 to 6).]

There would be $v = c_m + c_b c_m = 35$ treatments composed of five sole
crops and 30 biblends. Experiment designs appropriate for 36 treatments
would be used (see, e.g., Federer and Kirton, 1984).
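The treatment count can be checked by enumerating the design. The following is a minimal sketch, assuming five maize and six bean varieties as in the illustration; the variety labels (`M1`, `B1`, and so on) are hypothetical placeholders, not names from the experiment.

```python
from itertools import product

# Hypothetical labels for the c_m = 5 maize and c_b = 6 bean varieties.
maize = [f"M{i}" for i in range(1, 6)]
beans = [f"B{j}" for j in range(1, 7)]

# Each maize variety is grown alone and with every bean variety.
sole_crops = [(m,) for m in maize]
biblends = list(product(maize, beans))
treatments = sole_crops + biblends

print(len(sole_crops), len(biblends), len(treatments))
```

The enumeration gives five sole crops plus 30 biblends, matching $c_m + c_b c_m$.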

Statistical analyses for experiments in a given experiment design and
for the above treatment design would involve the same types of statistical
analyses as used for sole crop experiments (see, e.g., Snedecor and Cochran,
1967). Some common statistical procedures used would be

(i) single (or subsets of) degree(s) of freedom contrasts,
(ii) multiple comparisons procedures,
(iii) subset selection procedures,
(iv) covariance analyses, and
(v) multivariate analyses.

Some additional statistical analyses found useful for yields are:

(vi) Tukey's one-degree-of-freedom analysis for the crop one by
crop two interaction,
(vii) Finlay-Wilkinson (1963) analysis for mixtures,
(viii) tests for interaction given that one or more of the $c_m$ maize
varieties are standards for comparison, and
(ix) analyses in which yields of the main crop are not to be reduced by
more than a fixed percentage.

3. Two Main Crops - Density Constant


Experiment design considerations for biblends when both crops are main
crops are the same as discussed in Section 2. The treatment design would
have sole crops of both crops included; otherwise, it is the same as
discussed in Section 2. Statistical analyses on the yields of each crop
separately would follow those outlined in the previous section.

In order to evaluate cropping systems and to compare biblend production
with sole crop production, it is necessary to combine the yields of
both crops in some meaningful manner. An economic point of view would
place a value, $v_i$, on the produce from crop $i$, say, and use

$$V = v_1 Y_1 + v_2 Y_2 .$$

If the $v_i$ are prices, it might be more realistic to use ratios of
prices, which are more stable, and use relative values

$$V^* = Y_1 + Y_2 (v_2 / v_1) .$$

For sole crops, $V$ (or $V^*$) could be obtained by putting $Y_2 = 0$ for
crop one and $Y_1 = 0$ for crop two. A nutritional point of view
would convert the yield to calories and/or protein and use a measure of the
form

$$C = c_1 Y_1 + c_2 Y_2 ,$$

where $c_i$ is a calorie (or protein) conversion factor. An agronomic or
land use point of view would consider a linear combination of yields of
the form

$$L = \frac{Y_{b1}}{Y_{s1}} + \frac{Y_{b2}}{Y_{s2}} ,$$

where $Y_{bi}$ is the yield of crop $i$ in a biblend mixture and $Y_{si}$
is the yield of crop $i$ grown as a sole crop. There are many forms of $L$,
which is called relative yield or land equivalent ratio. The component
yields of the mixture are put into proportions of yields obtained from sole
crops. Since yields may vary considerably, a ratio of sole crop yields
might be more stable. In this case a "relative land equivalent" ratio
would be computed as

$$L^* = Y_{b1} + Y_{b2} (Y_{s1} / Y_{s2}) .$$
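A short numerical sketch may help fix the four combined measures. It assumes the forms $V = v_1 Y_1 + v_2 Y_2$, $V^* = Y_1 + Y_2 (v_2/v_1)$, $C = c_1 Y_1 + c_2 Y_2$, $L = Y_{b1}/Y_{s1} + Y_{b2}/Y_{s2}$, and $L^* = Y_{b1} + Y_{b2}(Y_{s1}/Y_{s2})$; the function name and all numbers are hypothetical and chosen only for illustration.

```python
def combined_yields(yb1, yb2, ys1, ys2, v1, v2, c1, c2):
    """Combine the two biblend yields into single created variables."""
    return {
        "V": v1 * yb1 + v2 * yb2,            # economic value
        "V_star": yb1 + yb2 * (v2 / v1),     # value relative to crop one
        "C": c1 * yb1 + c2 * yb2,            # calorie (or protein) total
        "L": yb1 / ys1 + yb2 / ys2,          # land equivalent ratio
        "L_star": yb1 + yb2 * (ys1 / ys2),   # relative land equivalent
    }

# Hypothetical yields, prices, and conversion factors.
result = combined_yields(yb1=3.0, yb2=0.8, ys1=4.0, ys2=1.6,
                         v1=1.0, v2=3.0, c1=3.6, c2=3.4)
print(result)
```

Under these assumed forms, $V^* = V / v_1$ and $L^* = L \, Y_{s1}$, so the relative versions simply re-express the same combination in units of crop one.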

A statistical point of view would use a discriminant function analysis and
construct a canonical variable of the form:

$$D = Y_{b1} + R Y_{b2} ,$$

where $R$ is chosen to maximize the ratio of the treatment sum of squares
to the treatment plus error sums of squares.

The first three linear combinations given above, i.e., V, C, and L, are
readily interpretable by a researcher or a farmer. The last one,
D, is not, and sole crop yields cannot be compared with D, but can be with V,
C, and L. Although a statistician's first thought in combining yields
most likely would be to use multivariate analyses, this would not be the
correct thing to do, as comparisons of sole crop yields and farming system
yields cannot be made and the canonical variable has no practical meaning
in the sense that C, V, and L do. Some aspects of multivariate analyses
have been found useful by Pearce and Gilliver (1978, 1979) in studying the
nature of response from mixtures.


Statistical analyses for linear combinations C, V, and L are
straightforward. Those outlined in the previous section may be utilized.
These created functions of yield may be used in the same manner as canonical
variables from a discriminant function analysis, i.e., univariate analyses
are performed on the canonical variables. It is possible to combine value
and land use by taking the ratio $Y_{s1} v_2 / (Y_{s2} v_1) = R$ and using
the created function of yields $Y_1 + R Y_2$. It does not appear realistic
to combine variables other than yield variables as described above.

4.  Two  Main  Crops  -  Density  Variable 


Plant populations per hectare in sole crops and in biblends need to be
considered seriously in conducting intercropping investigations. Crop
densities maximizing yields $Y_i$, or linear combinations of yield V, C, and
D, are desired. Using univariate analyses, a multiple comparisons or
subset selection procedure may be used to pick the "optimal" densities for
the crops. A useful procedure would be to model yield as a function of plant
density. Within narrow ranges of densities, a linear approximation of the
following form has been found to be useful:

$$Y_{i \ell_i k} = \beta_{0i} + \rho_k + \beta_{1i} d_{i \ell_i} + \epsilon_{i \ell_i k} ,$$

where $Y_{i \ell_i k}$ is the yield of the $i$th crop as a sole crop,
$\beta_{0i}$ is an intercept, $\beta_{1i}$ is a linear regression
coefficient, $d_{i \ell_i}$ is the density for crop $i$, $\rho_k$ is the
effect of block $k$, and $\epsilon_{i \ell_i k}$ is a random error term with
mean zero and variance $\sigma^2$. Note that a variety of other functional
relations could be used to model yield as a function of density. Using the
above form, the yields of crop $i$ in the mixture $ij$ of two crops may be
expressed as

$$Y_{i(j) \ell_i \ell_j k} = \beta_{0i} + \rho_k + \beta_{1i} d_{i \ell_i} + \gamma_{i(j)}(d_{i \ell_i}, d_{j \ell_j}) + \epsilon_{i(j) \ell_i \ell_j k} ,$$

where $\gamma_{i(j)}(d_{i \ell_i}, d_{j \ell_j})$ is an additive effect on
the yield of crop $i$ due to its being intercropped with crop $j$ at the
corresponding densities $d_{i \ell_i}$ and $d_{j \ell_j}$. A large positive
value of $\gamma_{i(j)}(d_{i \ell_i}, d_{j \ell_j})$ is desired.

When there are many lines of a cultivar in an investigation, the above
analysis may be conducted for each line. Then, analyses over all lines can
be obtained.
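The linear density model can be illustrated with a small least-squares fit. This is only a sketch under simplifying assumptions: block effects are ignored, the data are made up so that the sole crop yields fall exactly on a line, and the intercropping effect is estimated as the difference between an observed mixture yield and the sole crop prediction at the same density. The function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical sole crop densities (thousand plants/ha) and yields (t/ha).
densities = [20.0, 30.0, 40.0, 50.0]
sole_yields = [2.0, 2.5, 3.0, 3.5]
b0, b1 = fit_line(densities, sole_yields)

# Hypothetical yield of the same crop at density 40 when intercropped.
mixture_yield = 3.4
gamma_hat = mixture_yield - (b0 + b1 * 40.0)
print(b0, b1, gamma_hat)
```

A positive `gamma_hat` corresponds to the desired case in which the crop yields more in the mixture than the sole crop line predicts at that density.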

5. Modeling Responses for Sole Crops and Biblends - Two Responses

In many situations, responses for both components of a mixture are
available. The crops may be intermingled but distinct in type so that
responses for each crop are obtained, or the crops may be spatially
separated and again responses for each crop are available. For treatment
designs containing all sole crops and all possible combinations of lines of
crops in mixtures of two, response model equations can be constructed which
have measures of a general mixing ability (gma) effect and of a specific
mixing ability (sma) effect of a line or crop. To illustrate, suppose that
it was desired to compare yields of $v = 5$ bean cultivars as sole crops
and in mixtures of two. The $v(v + 1)/2 = 15$ combinations would be:


Cultivar    1   2   3   4   5

   1        S   B   B   B   B
   2            S   B   B   B
   3                S   B   B
   4                    S   B
   5                        S
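The 15 combinations in this triangular layout can be generated directly. The sketch below assumes five cultivars labeled 1 to 5 and marks a combination of a cultivar with itself as a sole crop (S) and any other pair as a biblend (B).

```python
from itertools import combinations_with_replacement

v = 5
# Unordered pairs (i, j) with i <= j: the upper triangle including the diagonal.
pairs = list(combinations_with_replacement(range(1, v + 1), 2))
design = {(i, j): ("S" if i == j else "B") for i, j in pairs}

n_sole = sum(1 for kind in design.values() if kind == "S")
n_biblend = sum(1 for kind in design.values() if kind == "B")
print(len(design), n_sole, n_biblend)
```

The counts are $v = 5$ sole crops and $v(v - 1)/2 = 10$ biblends, for $v(v + 1)/2 = 15$ treatments in all.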

where S stands for sole crop and B denotes a biblend. With such a
treatment design in a randomized complete block design, one possible linear
model is:

Sole crop i:

$$Y_{hiis} = \mu + \rho_h + \tau_i + \epsilon_{hiis} ,$$

where $\mu + \rho_h$ is a block mean effect, $\tau_i$ is a cultivar effect,
and the $\epsilon_{hiis}$ have zero mean and common variance $\sigma^2$.

Biblend i,j:

$$Y_{hi(j)b} = \tfrac{1}{2}(\mu + \rho_h + \tau_i + \delta_i) + \gamma_{i(j)} + \epsilon_{hi(j)b} ,$$

$$Y_{h(i)jb} = \tfrac{1}{2}(\mu + \rho_h + \tau_j + \delta_j) + \gamma_{(i)j} + \epsilon_{h(i)jb} ,$$

where $Y_{hi(j)b}$ is the yield of cultivar $i$ from the mixture $ij$;
$\mu$, $\rho_h$, and $\tau_i$ are as defined for sole crops; $\delta_i$ is a
general combining ability effect for cultivar $i$ when grown in biblends;
$\gamma_{i(j)}$ is an interaction effect for crop $i$ in the presence of
crop $j$; and the $\epsilon_{hi(j)b}$ are error components for cultivar $i$
responses which have zero mean and common variance $\sigma^2/2$. The
coefficient 1/2 is included in order to have the $\mu$, $\rho_h$, $\tau_i$,
and $\delta_i$ from the biblends on the same basis as the corresponding
parameters for sole crops. With two cultivars on the same area of land as
the sole crops, each crop response can only contribute 1/2 to $\mu$,
$\rho_h$, and $\tau_i$. Response model equations can easily be constructed
for the case where one crop occupies a proportion $p$ of the area and the
second crop occupies $1 - p$ of the area. In this case, care must be taken
in defining an interaction effect. An interaction is defined to relate to
two items in equal proportions. To interact, both must be present. When
$p < 1/2$, only $2p$ of the total material in an experimental unit is
available to interact on a 1:1 basis; $1 - 2p$ of the material is not
available. If some such definition as the above is taken, interaction
effects will be invariant with respect to changing proportions $p$.

Note that when other treatment designs are used, other models can be
constructed. For example, suppose that only a subset of the $v(v - 1)/2$
biblends were included in an experiment along with sole crops. The
parameters $\mu$, $\rho_h$, $\tau_i$, and the sum
$\delta_i/2 + \gamma_{i(j)}$ can be estimated. It is not possible to obtain
solutions for $\delta_i/2$ and $\gamma_{i(j)}$ separately, but only for
their sum. If the experimenter were willing to assume that the
$\gamma_{i(j)}$ for the biblends not present were all zero, then solutions
are possible. This is considered to be an unrealistic assumption.

6. Modeling Responses for Sole Crops and Biblends - One Response

For certain types of mixtures, such as a diallel crossing
experiment, it is impossible or difficult to obtain responses for both
components of a biblend. Experiments involving sole crops and mixtures of
two lines of a cultivar where the lines are not phenotypically distinct or
are not spatially separated would be found for wheat, beans, and many other
crops. In mixtures of grasses and legumes in hay it is difficult to obtain
the separate responses for each member in the mixture. Several response
models are available. For a randomized complete block design and the
treatment design involving sole crops and all possible biblends, the
following pair of equations for sole crop and biblend yields has been
proposed (Federer et al., 1982):

$$Y_{hiis} = \mu + \rho_h + \tau_i + \epsilon_{hiis}$$

$$Y_{hijb} = \mu + \rho_h + (\tau_i + \delta_i + \tau_j + \delta_j)/2 + \gamma_{ij} + \epsilon_{hijb}$$

where the effects are as defined in the previous section except for
$\gamma_{ij}$, which is an interaction component for specific mixing
ability. Note that $\gamma_{ij}$ is equal to the sum
$\gamma_{i(j)} + \gamma_{(i)j}$. These last two components cannot be
estimated unless individual responses are available, whereas $\gamma_{ij}$
can be estimated when only the combined response is available.
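The link between the combined-response model and the two component-response models can be written out by adding the component equations for a biblend. The following is a sketch, assuming that the combined plot response is the sum of the two component responses and that the two component errors are independent, each with variance $\sigma^2/2$ (the independence is an assumption made here for illustration):

```latex
\begin{align*}
Y_{hijb} &= Y_{hi(j)b} + Y_{h(i)jb} \\
  &= \tfrac{1}{2}(\mu + \rho_h + \tau_i + \delta_i) + \gamma_{i(j)} + \epsilon_{hi(j)b}
   + \tfrac{1}{2}(\mu + \rho_h + \tau_j + \delta_j) + \gamma_{(i)j} + \epsilon_{h(i)jb} \\
  &= \mu + \rho_h + \tfrac{1}{2}(\tau_i + \delta_i + \tau_j + \delta_j)
   + \gamma_{ij} + \epsilon_{hijb},
\end{align*}
\qquad
\gamma_{ij} = \gamma_{i(j)} + \gamma_{(i)j}, \quad
\epsilon_{hijb} = \epsilon_{hi(j)b} + \epsilon_{h(i)jb}, \quad
\operatorname{Var}(\epsilon_{hijb}) = \tfrac{\sigma^2}{2} + \tfrac{\sigma^2}{2} = \sigma^2 .
```

Only the sum $\gamma_{ij}$ appears in the combined equation, which is why the two separate interaction components cannot be recovered from the combined response alone.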

Another treatment design would be sole crops, all combinations, and
all reciprocals. To illustrate, suppose that $v = 5$ wheat varieties are
available, and the experimenter wishes to have all varieties bordered by
every other variety and itself. Responses from border rows are not
obtained. The $v^2 = 25$ treatments would be:


                     Border
Wheat Variety    1   2   3   4   5

      1          S   B   B   B   B
      2          B   S   B   B   B
      3          B   B   S   B   B
      4          B   B   B   S   B
      5          B   B   B   B   S

where S denotes sole crop and B denotes the mixture. Note that variety 1
bordered by variety 2 is not the same as variety 2 bordered by variety 1.
One set of response models for sole crops and biblends, respectively, is:

$$Y_{hiis} = \mu + \rho_h + \tau_i + \epsilon_{hiis}$$

and

$$Y_{hijb} = \mu + \rho_h + \tau_i + \delta_i + \gamma_{ij} + \epsilon_{hijb}, \quad i \neq j ,$$

where $\mu$, $\rho_h$, $\tau_i$, $\delta_i$, and $\epsilon_{hijb}$ are as
defined above and $\gamma_{ii}$ is a within variety interaction term with
$\gamma_{ii} = 0$; $\gamma_{ij}$ is an interaction term for crop $i$ when
bordered by crop $j$.
A second response model equation for the above treatment design would
be the one for a two-factor (crops and borders) factorial:

$$Y_{hij} = \mu + \rho_h + \alpha_i + \beta_j + \alpha\beta_{ij} + \epsilon_{hij} ,$$

where $\alpha_i$ is the effect of crop $i$, $\beta_j$ is the effect of
border $j$, and $\alpha\beta_{ij}$ is an interaction term. Such a model
would not be too realistic in a variety of situations since sole crop
responses may be quite different from biblend responses.

A third model is adapted from Martin (1980) and is the previous model
with the following change:


aB 


ij 


'ij 


♦  u 


♦  r , 


ij  ij 


where  ■  n  for  1  ■  J  and 

(«Bij  +  a8ji)/2  +  n/(v-l)  for  i  # 


nij  "  for  1  *  j*  “jj  * 

J,  and  j  -  aBj^/2. 


A fourth model is a mixture of the previous ones and is


Yhijb  ■  *  *  ph  *  'i  *  *i  *  #j  ♦  “ij  *  "ij  *  *hijb  • 

where  Bj  and  <u^  are  similar  to  the  above  Bj  and  but  are  condi¬ 
tional  on  the  fact  that  <*B^  ■  0;  the  remaining  parameters  are  as  defined 
above . 

Other  situations  will  lead  to  the  construction  of  other  response  model 
equations.  Appropriate  models  will  need  to  be  constructed  for  the 
particular  conditions  encountered  in  an  investigation. 


7. Spatial and Density Arrangements


Spatial arrangements and density levels are very important items to
consider in intercropping investigations. By spatial arrangement, we mean
the pattern used for plants in a given area of land. The plants could be in
rows, in hills, or drilled. The number of plants per hectare could be
varied over a wide range. The following five items need to be studied for
any intercropping investigation:

(i) spatial arrangement of crop one,
(ii) spatial arrangement of crop two,
(iii) density of crop one,
(iv) density of crop two, and
(v) intimacy of the two crops.

By intimacy we mean the closeness of plants of the two crops. If plants of
the two crops are randomly mingled in the same row, we say that they are
100% intimate. Plants of the two crops in separate rows would be less
intimate. If the two crops were isolated far enough to eliminate any
interaction, they have zero intimacy. To illustrate, suppose that density
is not a variable but intimacy and spatial arrangement are. One plan could
be to have two crops, say maize and beans, in the same row with rows one
meter apart. A second plan could be to double the density within rows and
double the distance between rows. The density per hectare and intimacy
would be the same but spatial arrangement would be different. A third plan
would be to alternate rows of the two crops. The intimacy would be less
than in the first two plans. Another plan commonly used for maize and
beans in Brazil is one row of maize and two rows of beans alternating as
below (M = maize and B = beans):

M B B M B B M B B M B B M ...

The maize rows are one meter apart and the bean rows are one-half meter
apart. A fifth plan would be:

M M B B B B M M B B B B M M B B B B M M ...

The pairs of maize rows are 1.75 meters apart and the rows of a pair are
0.25 meter apart. The bean rows are one-half meter apart. The last plan
could be the best, as more light would be available for maize and for bean
plants than in the previous plans. The rows should be oriented in a
north-south direction in order to benefit from the additional light.
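A quick calculation suggests that the last two row plans keep the same number of rows of each crop per unit width and differ only in spatial arrangement. The sketch below rests on an assumed reading of the spacings: the one-maize, two-bean pattern is taken to repeat every 1.0 meter, and the paired-maize pattern every 0.25 + 1.75 = 2.0 meters. The function name is hypothetical.

```python
def rows_per_meter(pattern, period_m):
    """Rows of each crop per meter of width for a repeating row pattern."""
    return {crop: pattern.count(crop) / period_m for crop in sorted(set(pattern))}

# Assumed repeat widths (see the lead-in); M = maize row, B = bean row.
plan_four = rows_per_meter("MBB", period_m=1.0)
plan_five = rows_per_meter("MMBBBB", period_m=0.25 + 1.75)

print(plan_four)
print(plan_five)
```

Under these assumptions both plans carry one maize row and two bean rows per meter of width, so any yield difference between them would reflect arrangement and light rather than row density.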

Several plans are available to study wide variations in density with a
relatively small amount of material. They should be used to obtain
information on ranges of density for future study. The best known of these
is the fan design of Nelder (1962). There are several versions of this
design. Another useful design has been suggested by B. N. Okigbo (1978).
The design is a circle with orientation noted (see below). A small circle
in the center is not used, as some space is needed to start the rows. The
row spacing becomes increasingly distant as one moves away from the center
of the circle. The density within a row could be kept constant, or the
density per hectare could be kept constant by increasing the density within
a row as one moves away from the center of a circle of the following
nature:
[Figure: circular design, with North marked at the top and South at the
bottom.]


The lines above could indicate the rows of plants. The above design could
be for a single crop or for mixtures of two crops using the previously
described plans for spatial arrangements and intimacy. A Nelder fan design
would be one-quarter of the above and would be used if directional
orientation were unimportant. Both the Okigbo-circle and the Nelder-fan
designs are very parsimonious of space. One statistical analysis would be to
divide the circle into concentric circles of equal areas. Yields would
then be obtained for the areas of individual rows. The results could be
plotted graphically to determine optimal yields, or some regression function
could be fitted to the yields. Optimal row distances and optimal densities
for yield could then be obtained. These circles or fans could be
constructed for various cropping systems and replicated over a range of
conditions to be encountered in practice. It may be possible to determine
optimal density, spatial arrangement, and intimacy well enough so that
future experimentation is not necessary. However, it is likely that future
experimentation will be needed to determine optimal values more precisely.
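Dividing the circle into concentric bands of equal area is a matter of spacing the squared radii evenly. The following sketch assumes a circle of a given outer radius with an unused inner circle; the function name and the dimensions are hypothetical.

```python
import math

def equal_area_radii(outer_radius, n_rings, inner_radius=0.0):
    """Boundary radii that split the annulus into n_rings bands of equal area."""
    step = (outer_radius ** 2 - inner_radius ** 2) / n_rings
    return [math.sqrt(inner_radius ** 2 + k * step) for k in range(n_rings + 1)]

# Hypothetical plot: 10 m outer radius, 1 m unused center, four bands.
radii = equal_area_radii(outer_radius=10.0, n_rings=4, inner_radius=1.0)
areas = [math.pi * (r_out ** 2 - r_in ** 2)
         for r_in, r_out in zip(radii[:-1], radii[1:])]
print(radii)
print(areas)
```

Because the bands have equal area, yields harvested band by band are directly comparable even though the bands become narrower toward the edge of the circle.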

8. Variations and Additional Analyses

Many and diverse situations exist in intercropping research. One such
area is to study the effect of replacing one crop in a mixture with a
second crop with proportions ranging from zero to one. Given that $p_a$ is
the proportion for crop a and $1 - p_a = p_b$ is the proportion of crop b in
the mixture and $Y_{si}$ = yield of sole crop $i$, the computed value for a
strictly replacement series would be $p_a Y_{sa} + p_b Y_{sb}$. If the yield
of the mixture at proportion $(p_a, p_b)$ was greater than this value, this
would be termed cooperation. If less, then denote this as inhibition. If one
crop is inhibited and the other exhibits cooperation, this would be denoted
as compensation since the yield of one crop is increased and the other is
decreased. For intercropping, proportions and crops showing a large amount
of cooperation are desired.
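The replacement series comparison can be sketched as a small classifier. The code assumes the expected value $p_a Y_{sa} + p_b Y_{sb}$ with $p_b = 1 - p_a$ for the mixture total, and $p_a Y_{sa}$ and $p_b Y_{sb}$ for the two components; the function names, the tolerance, and the "no departure" label are hypothetical additions for illustration.

```python
def expected_replacement_yield(p_a, y_sa, y_sb):
    """Mixture yield expected under a strict replacement series."""
    return p_a * y_sa + (1.0 - p_a) * y_sb

def classify_mixture(y_mix, p_a, y_sa, y_sb, tol=1e-9):
    """Compare the total mixture yield with the replacement-series value."""
    diff = y_mix - expected_replacement_yield(p_a, y_sa, y_sb)
    if diff > tol:
        return "cooperation"
    if diff < -tol:
        return "inhibition"
    return "no departure"

def classify_components(y_ba, y_bb, p_a, y_sa, y_sb, tol=1e-9):
    """Compare each crop's mixture yield with its own expected share."""
    diff_a = y_ba - p_a * y_sa
    diff_b = y_bb - (1.0 - p_a) * y_sb
    if (diff_a > tol and diff_b < -tol) or (diff_a < -tol and diff_b > tol):
        return "compensation"
    if diff_a > tol or diff_b > tol:
        return "cooperation"
    if diff_a < -tol or diff_b < -tol:
        return "inhibition"
    return "no departure"

print(classify_mixture(5.0, 0.5, 4.0, 4.0))
print(classify_components(2.5, 1.5, 0.5, 4.0, 4.0))
```

The component version needs separate yields for each crop in the mixture, so it applies only when both responses are available.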

Several other statistics have been developed for competition studies.
A number of them are related to a land equivalent ratio.
Let $Y_{bi}/Y_{si} = L_i$, which is the proportional yield of the crop in a
mixture relative to the crop grown alone. A land equivalent ratio is
$L = L_1 + L_2$. A statistic was developed to compute total effective area
for the case where $A_i$ = area devoted to sole crop i and $A_m$ = area
devoted to the mixture of the two crops. Then, total effective area is
computed as $A_1 + A_2 + L A_m$. A relative crowding coefficient is computed
as $L_1 L_2 / [(1 - L_1)(1 - L_2)]$. A coefficient of aggressivity to measure
the dominance of one crop over another is computed as $L_1 - L_2$. A
competitive ratio index is given by $L_1/L_2$. Each of these can be adjusted
to the relative proportions $p_a : p_b$ of crop a and crop b in the mixture.
Other coefficients have been suggested. A number relate to crop stability (an
ill-defined term) and to "risk to farmers". Survival farming must take
some form of these measures into account as a farmer needs to produce food
every year in order to survive.
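
The two-crop statistics above can be collected in a short Python sketch. The function name, argument names, and the numerical yields are hypothetical and chosen only to make the arithmetic easy to follow; the formulas are the ones stated in the text.

```python
def competition_statistics(y_mix, y_sole, area_sole=(0.0, 0.0), area_mix=0.0):
    """Two-crop intercropping statistics built from proportional yields L_i.

    y_mix[i]  : yield of crop i in the mixture (illustrative numbers)
    y_sole[i] : yield of crop i grown alone
    """
    # L_i = yield of crop i in the mixture / yield of crop i as a sole crop
    l1 = y_mix[0] / y_sole[0]
    l2 = y_mix[1] / y_sole[1]
    ler = l1 + l2  # land equivalent ratio L = L_1 + L_2
    return {
        "L1": l1,
        "L2": l2,
        "land_equivalent_ratio": ler,
        # A_1 + A_2 + L * A_m
        "total_effective_area": area_sole[0] + area_sole[1] + ler * area_mix,
        # L_1 L_2 / [(1 - L_1)(1 - L_2)]
        "relative_crowding": (l1 * l2) / ((1 - l1) * (1 - l2)),
        "aggressivity": l1 - l2,
        "competitive_ratio": l1 / l2,
    }


# Hypothetical yields: crop 1 gives 3.0 mixed vs 4.0 alone, crop 2 gives 1.0 vs 2.0.
stats = competition_statistics(
    y_mix=(3.0, 1.0), y_sole=(4.0, 2.0), area_sole=(1.0, 1.0), area_mix=2.0
)
```

With these made-up yields, L_1 = 0.75 and L_2 = 0.5, so the land equivalent ratio exceeds one and crop 1 is the more aggressive component.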

Another type of analysis suggested by B. R. Trenbath in his discussion
of the Mead and Riley (1981) paper is linear programming. Here yields of
the crops as sole crops and in mixtures are required. Then for a goal, say
S units of starch and P units of protein, an optimal allocation of area to
sole crops and to mixtures can be computed. A farmer can minimize the land
area needed to reach his primary goal (food production) and can use the
remaining area of his farm for crops to achieve a secondary goal (say
produce for sale). Economic studies make use of linear programming for
some of their investigations.
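
A minimal sketch of the area-allocation problem just described, assuming three land uses (sole crop 1, sole crop 2, and their mixture) and two nutritional goals. The per-area outputs and the goals S = 12 and P = 9 are invented for illustration, and the tiny solver simply enumerates the vertices of the feasible region; it is not a general linear programming routine and is not taken from Trenbath's discussion.

```python
from fractions import Fraction
from itertools import combinations


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve a 3x3 linear system by Cramer's rule; None if singular."""
    d = det3(a)
    if d == 0:
        return None
    solution = []
    for col in range(3):
        m = [[b[r] if c == col else a[r][c] for c in range(3)] for r in range(3)]
        solution.append(det3(m) / d)
    return solution


def minimum_area(output_per_area, goals):
    """Minimise total area subject to nutrient goals and non-negative areas.

    output_per_area[n][j] = units of nutrient n produced per unit area of
    land use j (hypothetical numbers supplied by the caller).
    """
    rows = [[Fraction(v) for v in r] for r in output_per_area]
    rhs = [Fraction(g) for g in goals]
    for j in range(3):  # non-negativity constraints x_j >= 0
        rows.append([Fraction(int(c == j)) for c in range(3)])
        rhs.append(Fraction(0))
    best = None
    # A vertex is a point where three constraints hold with equality.
    for tight in combinations(range(len(rows)), 3):
        x = solve3([rows[t] for t in tight], [rhs[t] for t in tight])
        if x is None:
            continue
        feasible = all(
            sum(r[c] * x[c] for c in range(3)) >= g for r, g in zip(rows, rhs)
        )
        if feasible and (best is None or sum(x) < sum(best)):
            best = x
    return best


# Rows: starch and protein; columns: sole crop 1, sole crop 2, mixture.
areas = minimum_area([[4, 1, 3], [1, 3, 3]], goals=[12, 9])
total_area = sum(areas)
```

Under these assumed numbers the smallest area combines one unit of sole crop 1 with 8/3 units of the mixture; whatever land remains could then be allocated to the secondary goal.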

9. One Main Crop with Two or More Supplementary Crops

Consideration of mixtures for more than two crops in the mixture would
at first sight appear to be a straightforward extension of the procedures
for two crops. This is not the case. To illustrate this for one main crop
with supplementary crops, it would appear that one could simply follow the
procedures described in Section 2, but consider the following treatment
design and example. Barley was the main crop and only one barley variety
was included. Six other cultivars were grown in the experimental units along
with barley in combinations of one cultivar plus barley, all possible
combinations of three of the six cultivars plus barley, and all six cultivars
plus barley. Plant numbers per experimental unit were kept constant and the
same number of barley plants were harvested in every experimental unit.
Barley as a sole crop was one of the treatments. In all there were
1 + 6 + 20 + 1 = 28 treatments. For a randomised complete block design and
barley yields for variety g (one variety), one set of response equations is:


Sole crop - variety g

$$Y_{gh0} = \mu + \tau_g + \rho_h + \epsilon_{gh}$$

Variety g plus one crop i

$$Y_{ghi1} = \mu + \tau_g + \rho_h + \delta_i + \epsilon_{ghi}$$

Variety g plus two crops i and j

$$Y_{ghij2} = \mu + \tau_g + \rho_h + (\delta_i + \delta_j)/2 + \gamma_{ij} + \epsilon_{ghij}$$

Variety g plus three crops i, j, and k

$$Y_{ghijk3} = \mu + \tau_g + \rho_h + (\delta_i + \delta_j + \delta_k)/3 + 2(\gamma_{ij} + \gamma_{ik} + \gamma_{jk})/3 + \pi_{ijk} + \epsilon_{ghijk}$$

Variety g plus all cultivars

$$Y_{gh12\cdots v} = \mu + \tau_g + \rho_h + \delta_{\cdot} + \gamma_{\cdot\cdot} + \pi_{\cdot\cdot\cdot} + \cdots + \pi_{12\cdots v} + \epsilon_{gh12\cdots v}$$

For the above example, mixtures of barley with two other crops were not
included in the experiment. $\mu + \tau_g$ is the mean for barley variety g
grown as a sole crop, $\rho_h$ is the h'th block effect, $\delta_i$ is a
general mixing effect of crop i on barley yields, $\gamma_{ij}$ is a
bi-specific mixing effect of the combination of crops i and j on the yield of
barley, $\pi_{ijk}$ is a tri-specific effect of the combination of crops i, j,
and k on the yield of barley, $\pi_{12\cdots v}$ is a v-specific mixing effect
of the combination of all v crops on the yield of barley, and all the
$\epsilon$'s are considered to have mean zero and common variance
$\sigma^2_\epsilon$. The assumption of common variance appears to be a
realistic one for this experiment involving only barley yields.
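
The treatment count quoted for the barley example (1 + 6 + 20 + 1 = 28) is consistent with barley alone, barley with each single cultivar, barley with every set of three of six cultivars, and barley with all six. The following sketch checks that reading by enumeration; the cultivar labels are hypothetical placeholders.

```python
from itertools import combinations

cultivars = ["c1", "c2", "c3", "c4", "c5", "c6"]  # hypothetical labels

# Each treatment is the tuple of supplementary cultivars grown with barley.
treatments = [()]                                 # barley as a sole crop
treatments += list(combinations(cultivars, 1))    # barley + one cultivar
treatments += list(combinations(cultivars, 3))    # barley + three cultivars
treatments.append(tuple(cultivars))               # barley + all six cultivars

# Number of treatments with 0, 1, 3, and 6 supplementary cultivars.
counts = [sum(1 for t in treatments if len(t) == size) for size in (0, 1, 3, 6)]
```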


10. Three or More Main Crops - Density Constant

A first step in analysing data from an intercropping experiment
containing mixtures of three or more crops is to obtain statistical
analyses for each crop separately. The method of Section 9 may be used for
this when appropriate. Response model equations found useful for such
experiments designed in a randomised complete blocks design are:

Sole crop g ($h = 1, 2, \ldots, r$; $i = 1, 2, \ldots, c_g$):

$$Y_{ghi} = \mu_g + \rho_{gh} + \tau_{gi} + \epsilon_{ghi}$$

Mixtures of three in proportions $p_1 : p_2 : p_3$ ($p_1 \ge p_2 \ge p_3$):

Crop 1 yield, i'th line

$$Y_{1hi(jk)} = p_1(\mu_1 + \rho_{1h} + \tau_{1i} + \delta_{1i}) + 2p_2\gamma_{i(j)} + 2p_3\gamma_{i(k)} + 3p_3\pi_{i(jk)} + \epsilon_{1hi(jk)}$$

Crop 2 yield, j'th line

$$Y_{2h(i)j(k)} = p_2(\mu_2 + \rho_{2h} + \tau_{2j} + \delta_{2j}) + 2p_2\gamma_{(i)j} + 2p_3\gamma_{j(k)} + 3p_3\pi_{(i)j(k)} + \epsilon_{2h(i)j(k)}$$

Crop 3 yield, k'th line

$$Y_{3h(ij)k} = p_3(\mu_3 + \rho_{3h} + \tau_{3k} + \delta_{3k}) + 2p_3(\gamma_{(i)k} + \gamma_{(j)k}) + 3p_3\pi_{(ij)k} + \epsilon_{3h(ij)k}$$

where interaction effects $\gamma_{i(j)}$, $\pi_{i(jk)}$, etc. are defined for
equal amounts of material on an area basis, the $\epsilon_{ghi}$ have zero mean
and common variance $\sigma^2_{g\epsilon}$, the $\epsilon_{ghi(jk)}$ have zero
mean and common variance $\sigma^2_{g\epsilon} p_g$, $\rho_{gh}$ is a block
effect for crop g, $\mu_g$ is a mean effect for crop g, and the subscripts in
parentheses denote the other two crops in a mixture. Crops g, g*, and g' were
taken to be 1, 2, and 3, respectively. The i'th line of crop g, the j'th line
of crop g*, and the k'th line of crop g' are used. In experiments analysed to
date, only one line of each crop was included but the above equations are
written to allow for one to $c_g$ lines of each crop. Also, note that each
crop's contribution to an interaction term can be estimated.

The construction of created variables as a linear combination of
yields is straightforward from the two crop situation. For crop value, one
uses $\sum_{g=1}^{c} v_g Y_g$ instead of $v_1 Y_1 + v_2 Y_2$. Or, if $v_g$ is
made proportional to a base crop value, say $v_1$, the created relative value
will be $\sum_{g=1}^{c} Y_g (v_g/v_1)$. For calorie (or protein) value, the
created variable $\sum_{g=1}^{c} c_g Y_g$ or $\sum_{g=1}^{c} Y_g (c_g/c_1)$
would be used. For land use values, the linear combination of yields
$\sum_{g=1}^{c} Y_{gb}/Y_{gs} = \sum_{g=1}^{c} L_g$, or
$\sum_{g=1}^{c} Y_{gb}(Y_{1s}/Y_{gs}) = Y_{1s}\sum_{g=1}^{c} L_g$,
would be used for $Y_{gb}$ = yield of crop g in a mixture and $Y_{gs}$ = yield
of crop g as a sole crop.
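
The created variables are simple weighted sums of the component yields, as the following sketch shows. Function names and all numbers are hypothetical; the same `crop_value` helper would serve for calorie or protein value by passing conversion factors in place of prices.

```python
def crop_value(yields, values):
    """Sum over crops of v_g * Y_g."""
    return sum(v * y for v, y in zip(values, yields))


def relative_value(yields, values, base=0):
    """Sum over crops of Y_g * (v_g / v_base), value relative to a base crop."""
    return sum(y * v / values[base] for v, y in zip(values, yields))


def land_use_value(mix_yields, sole_yields):
    """Sum over crops of Y_gb / Y_gs, i.e. the sum of the L_g."""
    return sum(yb / ys for yb, ys in zip(mix_yields, sole_yields))


# Illustrative yields of three crops in a mixture and as sole crops.
mix = (2.0, 1.0, 0.5)
sole = (4.0, 2.0, 2.0)
prices = (2.0, 4.0, 8.0)

value = crop_value(mix, prices)                    # 12.0
rel_value = relative_value(mix, prices)            # 6.0, in units of crop 1
land_use = land_use_value(mix, sole)               # 1.25
land_use_in_crop1_units = sole[0] * land_use       # Y_1s * sum of L_g = 5.0
```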

Multivariate discriminant function analyses are not usable (see
Federer and Murty, 1984) for analysing data from intercropping experiments.
Multivariate theory needs considerable extension before it can be used.
Problems of missing values, comparisons of sole crops with linear
combinations of some of the crops, comparisons of different linear
combinations, and the practical interpretation of the linear combination
appear to make present concepts of multivariate theory unusable for
intercropping data. Satisfying mathematical considerations and not practical
interpretations is a vacuous solution for an experimenter trying to interpret
results from an experiment.

11. Three or More Main Crops - Density Variable


With only two crops in a mixture, the assumption that the sole crop
regression of yield on density holds for all densities of the second crop
may be tenable in a small region of densities. With more than two crops in
a mixture and with varying densities, this assumption may not be
appropriate. To illustrate, consider mixtures of three crops g g* g' for
g, g*, g' = 1, ..., c crops at densities $d_{ig}$, $d_{jg^*}$, and $d_{kg'}$,
for $i = 1, \ldots, c_g$, $j = 1, \ldots, c_{g^*}$, and
$k = 1, \ldots, c_{g'}$. The regressions could be obtained for each
of the $c_{g^*} c_{g'}$ density combinations and not just the sole crop. These
regressions could be compared for homogeneity to ascertain whether the sole
crop regression is appropriate for mixtures of three. If the regressions
can be considered to be homogeneous or relatively so, the following
response model equation for the yield of density combination
$(d_{ig}, d_{jg^*}, d_{kg'})$ may be expressed as:

$$Y_{ghi(jk)}(d_{ig}, d_{jg^*}, d_{kg'}) = \beta_{0g} + \rho_{gh} + \beta_{1g} d_{ig} + \gamma_{i(jk)}(d_{ig}, d_{jg^*}, d_{kg'}) + \epsilon_{ghi(jk)}(d_{ig}, d_{jg^*}, d_{kg'})$$

where $i = 1, \ldots, c_g$, $j = 1, \ldots, c_{g^*}$, and
$k = 1, \ldots, c_{g'}$; $\beta_{0g}$, $\rho_{gh}$, and $\beta_{1g}$ are
as defined in Section 4, and the
$\epsilon_{ghi(jk)}(d_{ig}, d_{jg^*}, d_{kg'})$ have zero mean and
common variance. The $\gamma_{i(jk)}(d_{ig}, d_{jg^*}, d_{kg'})$ may be
partitioned into an overall effect, an effect of crop g* at density j, an
effect of crop g' at density k, and an interaction effect for the jk'th
densities of crops g* and g'. These effects would relate to the yields of
crop g.


12. Modeling Responses for Mixtures of Three or More Crops - Individual
Crop Responses Available

Various response models for mixtures of two crops were discussed in
Section 5. For mixtures of three of c cultivars, say i, j, and k, the
following models are considered plausible for consideration using a RCBD:

Sole crop i

$$Y_{hi} = \mu + \tau_i + \rho_h + \epsilon_{hi}$$

Mixture ijk

Crop i yield:

$$Y_{hi(jk)} = (\mu + \rho_h + \tau_i + \delta_i)/3 + 2(\gamma_{i(j)} + \gamma_{i(k)})/3 + \pi_{i(jk)} + \epsilon_{hi(jk)}$$

Crop j yield:

$$Y_{h(i)j(k)} = (\mu + \rho_h + \tau_j + \delta_j)/3 + 2(\gamma_{(i)j} + \gamma_{j(k)})/3 + \pi_{(i)j(k)} + \epsilon_{h(i)j(k)}$$

Crop k yield:

$$Y_{h(ij)k} = (\mu + \rho_h + \tau_k + \delta_k)/3 + 2(\gamma_{(i)k} + \gamma_{(j)k})/3 + \pi_{(ij)k} + \epsilon_{h(ij)k}$$

A simpler form for crop i yield from a mixture of three would be

$$Y_{hi(jk)} = (\mu + \rho_h + \tau_i)/3 + \pi^{*}_{i(jk)} + \epsilon_{hi(jk)}$$

where $\delta_i$, $\gamma_{i(j)}$, $\gamma_{i(k)}$, and $\pi_{i(jk)}$ are
combined into an effect $\pi^{*}_{i(jk)}$. The interpretation of the
parameters is the same as described in previous sections. Solutions for the
parameters, subject to the usual restrictions, may be obtained when all
possible combinations of crops are present. Otherwise, it is recommended that
the above simpler form be used.


13. Modeling Responses for Mixtures of Three or More Crops - Individual
Crop Responses Not Available

Suppose that sole crops and all possible combinations of three of the
crops represent the treatments in a RCBD. Possible response model
equations are:

Mixture ijk

$$Y_{hijk} = \mu + \rho_h + (\tau_i + \delta_i + \tau_j + \delta_j + \tau_k + \delta_k)/3 + 2(\gamma_{ij} + \gamma_{ik} + \gamma_{jk})/3 + \pi_{ijk} + \epsilon_{hijk}$$

If all combinations were not present the model for mixtures may be
simplified to:

$$Y_{hijk} = \mu + \rho_h + (\tau_i + \tau_j + \tau_k)/3 + \pi^{*}_{ijk} + \epsilon_{hijk}$$

where a sum of general mixing ($\delta_i$), bi-specific mixing
($\gamma_{ij}$), and tri-specific effects would be represented in
$\pi^{*}_{ijk}$.

Several other models described in Section 6 can be generalised to
consider three or more crops in a mixture. When $v^3$ combinations of lines
of three crops or factors are present, a three-factor factorial model may
be used. Another response model for sole crops and mixtures of three crops
i, j, and k would be:

Sole crop

$$Y_{his} = \mu + \rho_h + \tau_i + \epsilon_{hi}$$

Mixture ijk

$$Y_{hijkb} = \mu + \rho_h + \tau_i + \gamma_{ijk} + \epsilon_{hijk}$$

where $\gamma_{ijk}$ is an interaction effect within crop line i of component
one of the mixture for lines j and k of the second and third components.
Alternatively, $\gamma_{ijk}$ could be an interaction effect within the
combination ij. To illustrate, suppose that four lines of a crop, say A, B,
C, D, are available, that center row yields only will be obtained, and that
the center rows will be bordered on one or both sides by every line. For line
A, the center and outside rows would be AAA, AAB, AAC, AAD, BAB, BAC, BAD,
CAC, CAD, DAD. The interaction effects $\gamma_{ijk}$ would be the deviations
of the quantities $\bar{y}_{\cdot ijkb} - \bar{y}_{\cdot i \cdot\cdot b}$, and
the interaction effects $\gamma_{ABk}$ would be the difference
$\bar{y}_{\cdot ABCb} - \bar{y}_{\cdot ABDb}$.
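
The ten row arrangements listed for line A are the unordered pairs of bordering lines, with repetition allowed, placed either side of the center row. A short sketch reproduces the list; the helper name is hypothetical.

```python
from itertools import combinations_with_replacement


def border_arrangements(center, lines):
    """All left-center-right row triples, treating the two borders as unordered."""
    return [
        left + center + right
        for left, right in combinations_with_replacement(lines, 2)
    ]


arrangements_for_a = border_arrangements("A", "ABCD")
# ['AAA', 'AAB', 'AAC', 'AAD', 'BAB', 'BAC', 'BAD', 'CAC', 'CAD', 'DAD']
```

With four lines there are 10 arrangements per center line, so 40 in all if every line is used as the center row.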


Martin (1980) states that his model does not extend to a three-factor
factorial. A response model for a two-factor factorial in a RCBD would be:

$$Y_{hij} = \mu + \rho_h + \alpha_i + \beta_j + \alpha\beta_{ij} + \epsilon_{hij}$$

Martin's model deals with functions of the $\alpha\beta_{ij}$. A corresponding
three-factor factorial response model would be:

$$Y_{hijk} = \mu + \rho_h + \alpha_i + \beta_j + \gamma_k + \alpha\beta_{ij} + \alpha\gamma_{ik} + \beta\gamma_{jk} + \alpha\beta\gamma_{ijk} + \epsilon_{hijk}$$

By constructing two-factor responses and using the previous model,
$\alpha\beta_{ij}$, $\alpha\gamma_{ik}$, and $\beta\gamma_{jk}$ can all be
partitioned. Partitioning of the three-factor interaction
$\alpha\beta\gamma_{ijk}$ does not appear to be straightforward. One could
collapse two of the factors into a single category and apply the previous
Martin model. The other models discussed in Section 6 can likewise be
extended.

14. Spatial, Density, and Intimacy Arrangements for Three or More Crops

For two crops, arrangements have been constructed to have one plant of
crop one bordered by zero, one, two, three, and four plants of the second
crop an equal number of times. Comparable plans for three or more crops
have not been devised to date. As long as all plants of three or more
cultivars (crops, lines of a crop, etc.) are randomly intermingled in an
experimental unit, no difficulty arises. As soon as cultivars are placed
in rows or planted in patterns, spatial patterns must be thoughtfully
considered. The following items must be investigated for three crops:

(i) density of crop one,

(ii) density of crop two,

(iii) density of crop three,

(iv) spatial arrangement of crop one,

(v) spatial arrangement of crop two,

(vi) spatial arrangement of crop three,

(vii) intimacy of crops one and two,

(viii) intimacy of crops one and three, and

(ix) intimacy of crops two and three.

When using the Nelder fan or the Okigbo wheel, care must be taken in
investigating orientation, density, spatial, and intimacy relations. These
designs will be parsimonious of space and should be considered as obtaining
preliminary results. More extensive investigation will more than likely be
required in order to determine optimal conditions. The above considerations
hold for mixtures of k of v crops.

15.  Additional  Statistics  for  Mixtures  of  Three  or  More  Crops 

Many of the statistics described in Section 8 may be extended to
consider mixtures of three or more crops. The total effective area under
three crops as sole crops, in mixtures of two, and in a mixture of three
would be:

$$A_1 + A_2 + A_3 + L_{12}A_{m12} + L_{13}A_{m13} + L_{23}A_{m23} + L_{123}A_{m123}$$

where $A_i$ = area under sole crop i, $A_{mij}$ = area under mixture of two
crops i and j, $A_{m123}$ = area under the mixture of the three crops,
$L_{ij}$ = land equivalent ratio for mixtures of crops i and j, and
$L_{123}$ is a land equivalent ratio for mixtures of the three crops.

A coefficient of aggressivity for two crops in equal proportions of
land area is $L_1 - L_2$. For three crops it would be $L_1 - (L_2 + L_3)/2$
for crop 1, $L_2 - (L_1 + L_3)/2$ for crop 2, and $L_3 - (L_1 + L_2)/2$ for
crop 3. Extension to k crops is straightforward. $L_i$ = yield of crop i in
the mixture divided by yield of crop i as a sole crop.

A competitive ratio index for two crops in equal proportions of land
area is $L_1/L_2$. For three crops, it would be $2L_1/(L_2 + L_3)$,
$2L_2/(L_1 + L_3)$, and $2L_3/(L_1 + L_2)$ for crops 1, 2, and 3,
respectively.

A relative crowding coefficient for two crops is
$L_1 L_2/[(1 - L_1)(1 - L_2)]$. For k crops in a mixture, the coefficient
would be $\prod_{i=1}^{k} L_i/(1 - L_i)$.
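
The extensions to three or more crops can be sketched as follows. The aggressivity and relative crowding functions follow the formulas in the text; writing the competitive ratio for k crops as L_i divided by the mean of the other L's is an assumption that reduces to the stated two-crop and three-crop forms. All numbers are illustrative.

```python
from math import prod


def aggressivity(l, i):
    """L_i minus the mean of the other crops' proportional yields."""
    others = [x for j, x in enumerate(l) if j != i]
    return l[i] - sum(others) / len(others)


def competitive_ratio(l, i):
    """L_i divided by the mean of the others (assumed k-crop generalisation)."""
    others = [x for j, x in enumerate(l) if j != i]
    return len(others) * l[i] / sum(others)


def relative_crowding(l):
    """Product over crops of L_i / (1 - L_i)."""
    return prod(x / (1 - x) for x in l)


def total_effective_area(sole_areas, mixtures):
    """Sum of sole-crop areas plus LER-weighted mixture areas.

    mixtures is a list of (land_equivalent_ratio, area) pairs.
    """
    return sum(sole_areas) + sum(ler * area for ler, area in mixtures)


l = (0.5, 0.25, 0.25)                 # hypothetical L_1, L_2, L_3
agg_1 = aggressivity(l, 0)            # 0.5 - (0.25 + 0.25)/2 = 0.25
cr_1 = competitive_ratio(l, 0)        # 2(0.5)/(0.25 + 0.25) = 2.0
rcc = relative_crowding(l)            # 1 * (1/3) * (1/3)
tea = total_effective_area(
    (1.0, 1.0, 1.0), [(1.2, 2.0), (1.1, 1.0), (1.0, 1.0), (1.5, 2.0)]
)
```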

Graphical representations for linear programming can be made for
mixtures of two and three crops, but not for mixtures of four or more
crops. However, linear programming techniques allow for k crops in a
mixture and as sole crops.

16. Other Mixtures Where Statistical Techniques Are Useful


There are a large number of areas where the ideas and statistical
procedures developed for intercropping can be used. For example, consider
a survey sampling situation where answers are sought to sensitive,
incriminating, and/or embarrassing questions. Direct questioning will not
allow the surveyor to obtain this information. Anonymity of response is
essential in order to obtain the information. Raghavarao and Federer (1979)
have shown how to use the block total response procedure using supplemented
and balanced incomplete block designs to obtain sensitive information. The
respondent is required to give a total of answers to k of v questions.
From the various block totals, estimates for the sample can be obtained
without knowing individual responses. This is similar to knowing only the
total response for a mixture rather than having the individual mixture
component responses.
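
The idea of recovering per-question information from block totals can be illustrated with the smallest balanced incomplete block design, v = 3 questions answered k = 2 at a time. The sketch below is a simple moment-matching illustration with invented totals; it is not claimed to be the estimator derived by Raghavarao and Federer.

```python
def estimate_question_means(block_totals):
    """Estimate the mean answer to each of three questions from pair totals.

    block_totals maps a pair of question indices, (0, 1), (0, 2), or (1, 2),
    to the list of totals reported by respondents assigned to that block.
    Each respondent reports only the sum of the two answers.
    """
    mean = {block: sum(t) / len(t) for block, t in block_totals.items()}
    t01, t02, t12 = mean[(0, 1)], mean[(0, 2)], mean[(1, 2)]
    # Solve m0 + m1 = t01, m0 + m2 = t02, m1 + m2 = t12.
    return [
        (t01 + t02 - t12) / 2,
        (t01 + t12 - t02) / 2,
        (t02 + t12 - t01) / 2,
    ]


# Hypothetical reported totals for two respondents per block.
totals = {(0, 1): [3, 5], (0, 2): [4, 6], (1, 2): [6, 8]}
question_means = estimate_question_means(totals)
```

No individual answer is ever recorded, yet the three block means determine the three question means.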

Other areas where these ideas can be utilized are applications of
drugs, therapies, medicines, recreational programs, physical training
programs, educational programs using sequences of courses and other
mixtures, nutritional studies, use of pesticide and herbicide mixtures, and
any other area where mixtures of components are involved. Studies in these
areas to date have centered on mean comparisons of single or similar
components, upon single responses for the mixture, and standard statistical
procedures. Modeling aspects and competitive aspects have been ignored.
Statistical theory has not provided adequate statistical methodology to do
more than what is being done. It is a fruitful area for future research
and application.
SCORING CASUALTIES FROM FIELD TRIALS


Carl T. Russell
Falls Church, Virginia
Fort Ord, California

ABSTRACT. Real Time Casualty Assessment (RTCA) is often used to
"shape  the  battle"  in  Army  operational  tests  by  simulating  attrition  in 
near  real  time  as  a  function  of  measured  engagement  conditions.  Based  on 
engagement  conditions  measured  by  test  instrumentation,  a  computer  obtains 
a  Pk  from  a  table  of  kill  probabilities  and  draws  a  random  number  against 
that  Pk  to  determine  whether  the  target  player  lives  or  dies.  Both  firing 
and  target  players  are  given  near  real  time  feedback  concerning  the 
result,  and  "dead"  players  are  removed  from  the  battle  as  quickly  as 
possible.  As  long  as  the  attrition  rates  used  real  time  are  approximately 
correct,  RTCA  encourages  realistic  engagement  conditions  by  generally 
rewarding  smart  player  actions  and  penalizing  dumb  ones.  That  is, 
"approximately  correct"  attrition  rates  suffice  to  "shape  the  battle." 
However,  if  test  measures  of  effectiveness  involve  force  losses,  attrition 
rates  which  are  only  approximately  correct  are  not  good  enough.  Post-test 
analysis  of  the  battle  typically  identifies  engagements  which  either  were 
improperly  recorded  by  test  instrumentation  or  were  partially  garbled 
during  real  time  computer  processing.  Alternatively,  the  analyst  may  be 
asked to estimate what force losses would have been with smaller or larger
Pk's for some players. Once the actual engagement conditions are
determined post test, an actual or hypothetical Pk (PKA) can be determined
and compared to the Pk used real time (PKU). Whenever PKA differs from
PKU, the attrition rate used real time was inappropriate and may have
started a cascade of misleading real time losses. The analytic goal is to
estimate what expected losses would have been if live ordnance (having
true Pk=PKA) had been used. "Aliveness analysis" is a computational
technique  which  attempts  to  meet  this  goal  by  crediting  kills  adjusted  for 
cumulative  differences  between  PKA  and  PKU.  The  technique  originated  at 
CDEC  and  was  modified  for  application  to  the  SGT  York  Follow  on  Evaluation 
conducted  in  April-May  1985.  This  paper  discusses  the  aliveness  analysis 
technique  and  illustrates  the  technique  using  examples  based  on  this 
SGT  York  testing. 


REAL  TIME  CASUALTY  ASSESSMENT 

RTCA Description. Army use of Real Time Casualty Assessment (RTCA)
originated at the Combat Development Experimentation Center (CDEC) in the
1970's and is used extensively in tests conducted on CDEC test ranges at
Fort  Hunter  Liggett,  California.  RTCA  is  an  instrumented  testing 
technique  which  shapes  the  battle  by  simulating  kills  in  near  real  time. 


Field trials at CDEC are generally two-sided free-play trials up to
battalion versus company in size, and they may involve armor, infantry,
aviation, air defense, field artillery, mines, or chemical equipment. The
trials are conducted under conditions as nearly realistic as possible, and
they are highly instrumented for trial control, safety, and data
collection. The computerized instrumentation consists of a number of
fixed stations (A stations) which poll transponders mounted on players (B
stations) and transmit position location and other data back to a central
computer (C station) for processing (see Figure 1). Players are
instrumented, typically using coded lasers, so that when one player fires
at another, the target can be identified. During the play of a test
battle, RTCA encourages realistic engagement conditions by generally
rewarding smart player actions and penalizing dumb ones. It does this
through a three step process (see Figure 2):

- When an engagement occurs (i.e., one player "fires" at another), a
coded laser is fired at the target. The B station on the firer
tells the computer "I fired" and the type of ammo used. If a target
player is paired with the firer, the B station on the target tells
the computer "I've been engaged," the code of the laser, and the
sensors illuminated.

- The computer receives the engagement information and analyzes it in
terms of variables which affect the probability of kill (typically,
the nature of the firer, the nature of the target, the range, the
ammunition used, firer and target movement, target exposure, and
target aspect). Tables of "kill probability" (Pk) determined by
pretest modeling are then used to determine the Pk associated with
the crucial engagement parameters. This Pk is then used to simulate
a "kill" or a "survive" via Monte Carlo. That is, the computer
draws a random number against the looked-up Pk, killing the target
if the random number is smaller than the Pk.

- This simulated engagement result is fed back to both firer and
target in the engagement, usually within a few seconds of the
original firing. Dead targets either stop (ground players) or leave
the battlefield as soon as possible (air players). Dead players are
typically marked by strobes or smoke and their ability to fire at
others is disabled.
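
The second step of the process, the table look-up followed by a random draw, can be sketched as below. The table keys, the Pk values in the table, and the function and class names are all hypothetical; real Pk tables depend on many more engagement variables than the three used here.

```python
import random

# Hypothetical Pk table keyed by (firer type, target type, range band).
# The values are illustrative only and do not come from any real test.
PK_TABLE = {
    ("rotor_wing", "tank", "short"): 0.7,
    ("rotor_wing", "tank", "long"): 0.6,
    ("air_defense", "rotor_wing", "long"): 0.3,
}


def assess_engagement(firer, target, range_band, rng):
    """Look up the engagement Pk and simulate a kill or survive.

    Returns (pk, killed); the target is killed when the random number
    drawn is smaller than the looked-up Pk.
    """
    pk = PK_TABLE[(firer, target, range_band)]
    draw = rng.random()
    return pk, draw < pk


class FixedDraw:
    """Stand-in random source that always returns one value (for checking)."""

    def __init__(self, value):
        self.value = value

    def random(self):
        return self.value


# A draw of 0.8635 against a Pk of 0.6 leaves the target alive.
pk, killed = assess_engagement("rotor_wing", "tank", "long", FixedDraw(0.8635))

# In use, a seeded generator would supply the draws.
pk_live, killed_live = assess_engagement(
    "rotor_wing", "tank", "short", random.Random(1986)
)
```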

RTCA  Interpretation.  Field  trials  as  conducted  at  CDEC  are 
simulations  of  actual  combat,  not  reality  itself.  Representative  battle 
initial  conditions  are  determined  prior  to  the  start  of  testing,  and  RTCA 
is  used  to  shape  the  test  battle  so  that  post-test  estimates  of  attrition 
will  provide  reasonable  predictive  insight  to  actual  battle  outcomes  (see 
Figure  3).  The  role  of  RTCA  in  this  simulation  is  as  a  tool  to  encourage 
sequences  of  individual  engagement  conditions  representative  of  combat 
under  the  specified  initial  conditions. 
Problem Statement. How should attrition be estimated in the context
of RTCA? The first inclination of an analyst accustomed to binomial
coin-tossing experiments is to count simulated kills. That approach is
wrong because it adds unnecessary variability to attrition estimates. The
right approach is to use sums of Pk's.


Simplified Example. Consider a simplified example in an
air defense context. Three Red aircraft (a rotor wing labelled R and
fixed wings labelled S and T) attack a Blue armor force consisting of five
tanks (labelled 1 through 5) and two air defense weapons (labelled A and
B). Imagine that five engagements ensue (see Table 1):

- At 1 minute into the battle, Red rotor wing R engages tank 1 under
engagement conditions which give a Pk of 0.6. The computer draws
the random number 0.8635 (or some such) against 0.6, so tank 1
survives.

- At 3 minutes into the battle, Red rotor wing R engages tank 2 under
engagement conditions which give a Pk of 0.7. A random number less
than 0.7 is drawn, so tank 2 is killed.

- At 4 minutes into the battle, Red fixed wing S engages tank 3 under
engagement conditions which give a Pk of 0.1, but a random number
larger than 0.1 is drawn, so tank 3 survives.

- At 5 minutes into the battle, Blue air defense weapon B engages
fixed wing T with a Pk of 0.8, but T survives.

- Finally, at 7 minutes into the battle, Blue air defense weapon A
engages rotor wing R with a Pk of 0.3, but R survives.


The RTCA body count gives one Blue and no Red killed, but a better
attrition estimate is clearly available. The expected kill on each
engagement is known once the Pk is known, so a partial kill equal to the
observed Pk should be credited at each engagement. Overall expected kills
should be calculated by summing these credited kills. That is, sums of
Pk's should be used.
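
For the five engagements of the simplified example, the body count and the sum-of-Pk estimate can be compared directly. The data layout and function name below are hypothetical; the Pk values and outcomes are those given in the example.

```python
# (minute, firing side, target side, Pk, simulated kill)
engagements = [
    (1, "Red", "Blue", 0.6, False),   # rotor wing R vs tank 1, survives
    (3, "Red", "Blue", 0.7, True),    # rotor wing R vs tank 2, killed
    (4, "Red", "Blue", 0.1, False),   # fixed wing vs tank 3, survives
    (5, "Blue", "Red", 0.8, False),   # air defense B vs fixed wing T, survives
    (7, "Blue", "Red", 0.3, False),   # air defense A vs rotor wing R, survives
]


def attrition(engagements):
    """Return (simulated body count, sum of Pk's) for each target side."""
    body_count = {"Blue": 0, "Red": 0}
    expected = {"Blue": 0.0, "Red": 0.0}
    for _minute, _firer_side, target_side, pk, killed in engagements:
        body_count[target_side] += int(killed)
        expected[target_side] += pk
    return body_count, expected


body_count, expected_kills = attrition(engagements)
```

The body count is one Blue and no Red, while the sums of Pk's credit about 1.4 Blue losses and 1.1 Red losses.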


Trick Question. Estimating attrition from RTCA data is like
estimating the expected number of "3's" from an experiment where a fair
die is rolled twice with a "5" and a "1" observed. The expected number of
"3's" could be estimated as zero, and that estimation procedure would be
unbiased because, in the long run, the average number of "3's" obtained in
two rolls of a fair die would be one third. However, since it is given
that the die is fair, the probability of rolling a "3" on any one roll is
known to be one sixth so the expected number of "3's" must be one third.
That "5" and "1" were rolled in one experiment is irrelevant. The
expected number of "3's" should be estimated as one third because the
relevant probability is known in advance.
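
The die analogy can be made concrete by enumerating all 36 outcomes of two rolls. Counting observed "3's" is unbiased but variable, whereas the known probability gives the expected number with no sampling variability at all. The sketch uses exact fractions; variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of rolling a fair die twice.
outcomes = list(product(range(1, 7), repeat=2))

# The "count what happened" estimator: number of 3's observed in each outcome.
counts = [roll.count(3) for roll in outcomes]

mean_count = Fraction(sum(counts), len(outcomes))
var_count = Fraction(sum(c * c for c in counts), len(outcomes)) - mean_count ** 2

# The "use the known probability" estimate: two rolls, each with chance 1/6.
known_expectation = 2 * Fraction(1, 6)
```

Both approaches average one third, but the counting estimator carries a variance of 5/18 that the known-probability estimate avoids, which is the argument for sums of Pk's over counts of simulated kills.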
Table 1

Simplified Example Showing that
Sums of Pk's Should be Used to Estimate Attrition


Crucial Points. In summary, there are three crucial points to be
remembered when making attrition estimates from RTCA data.

- The kill probability Pk for each engagement is obtained directly
from measured engagement conditions via a Pk table, not estimated
from simulated kills.

- A simulated kill or survive is generated by a draw of a random
number against engagement Pk, not observed directly from the
engagement.

- RTCA encourages realistic engagements by providing quick feedback to
players in terms of simulated kills, but attrition should be
estimated using sums of Pk's, not sums of simulated kills.

OVERKILL INHERENT WITH SUMS OF Pk's

Inherent Overkill. There is an inherent difficulty with estimating
attrition using sums of Pk's: individual players or groups can be
"killed" more than once. This fact has been used as an argument against
the aliveness formulas which will be discussed shortly, but the difficulty
is actually inherent with simple sums of Pk's. The initial reaction of
most analysts to such overkill is to consider it intolerable and attempt
to modify the way kills are credited in order to ensure that no more than
one kill is ever credited against an individual player. There is at least
one case where such deflation of overkill is indeed desirable. In
general, however, overkill is desirable. The following discussion:

-  illustrates  how  overkill  can  occur  by  revisiting  the  simplified 
example  discussed  in  the  preceding  section, 

-  shows  how  to  deal  with  one  case  of  undesirable  overkill,  and 

-  gives  a  rather  elaborate  example,  in  terms  of  a  hypothetical 
experiment,  which  provides  a  test  for  any  estimation  method  proposed 
as  an  alternative  to  sums  of  Pk's. 

Revisited Example. If in the simplified example of Table 1, rotor
wing R had fired at surviving tank 1 a second time rather than firing at
tank 2, the overall attrition estimates should not change. However, 1.3
kills must be credited against tank 1. There is no way around this
problem if unbiased estimates of attrition are desired. Overkills are
necessary to compensate for only partial kills credited against totally
dead players, as originally happened against tank 2 in Table 1: only 0.7
kill was credited but tank 2 was forever 100% dead.

Undesirable Overkill. One situation where overkill is clearly
undesirable occurs when one player fires several rounds at another player
over a relatively short period of time so that the rounds should be
considered only one engagement. Then the engagement Pk should be
calculated using appropriate products of Pk's rather than sums of Pk's.
In particular, if n rounds are fired, each with Pk = p, then 1-(1-p)^n
should be used rather than np as the engagement Pk. As long as np is
small, the difference between the two formulas is slight, since
np - [1-(1-p)^n] cannot exceed (np)^2 (proof: use binomial expansion).
However, if np is relatively large (in particular, if np is greater than
1), then the difference is large. For example, if three antitank weapons
are fired by the same firer against the same tank during a very short
period of time and each firing has Pk = 0.7, then np = 2.1 but
1-(1-p)^n = 0.973. The better answer is clearly 0.973.
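
The comparison between the two engagement formulas can be checked numerically. The following is a minimal sketch; the function names engagement_pk and summed_pk are illustrative choices, not names from the paper, and the rounds are assumed independent.

```python
def engagement_pk(p: float, n: int) -> float:
    """Kill probability for n rounds treated as one engagement: 1-(1-p)^n."""
    return 1.0 - (1.0 - p) ** n


def summed_pk(p: float, n: int) -> float:
    """Sum of the n per-round kill probabilities: np (may exceed 1)."""
    return n * p


# Compare the two formulas for small and large np.
for p, n in [(0.01, 3), (0.10, 3), (0.70, 3)]:
    np_sum = summed_pk(p, n)
    product = engagement_pk(p, n)
    gap = np_sum - product
    # The gap is never negative and never exceeds (np)^2.
    print(f"p={p:.2f} n={n}: np={np_sum:.3f}, 1-(1-p)^n={product:.3f}, "
          f"gap={gap:.4f}, (np)^2={np_sum ** 2:.4f}")
```

For the three-round antitank case the sketch reproduces np = 2.1 against an engagement Pk of 0.973, while for small p the two values are nearly indistinguishable.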

Desirable Overkill. If products of Pk's rather than sums of Pk's were
used to credit kills, no more than one credited kill could ever be
accumulated against a single player. Thus overkill could be avoided if
products of Pk's rather than sums of Pk's were always used to credit kills
for estimating attrition. The following example shows that crediting
kills using products is misleading because it generally underestimates
expected attrition. In addition, because it provides a situation with
real bullets and real deaths for which the true expected kills can be
calculated, the hypothetical experiment in this example provides a test
for any estimation method proposed as an alternative to sums of Pk's (see
Figure 4).

Hypothetical  Example. 

-  An urn contains 1,000,000(1-p) harmless blanks (painted green) and
1,000,000p absolutely lethal rounds (painted red).

-  Draw 100 rounds at random from this urn and load them into a gun
(which conveniently holds 100 rounds).

-  Now select 100 volunteers and shoot the rounds at them from point
blank range.

As long as care is taken to shoot only at live volunteers, 100p volunteers
are expected to die. The number of expected kills is the same no matter
which of the following methods is used to distribute shots among
volunteers:

-  Shoot one round at volunteer #1. Shoot one round at volunteer #2
... Shoot one round at volunteer #100.

-  Randomly select (with replacement) a living volunteer. Shoot one
round. Repeat the random selection and shooting of one round until
all rounds are expended.

-  Randomly select (without replacement) a living volunteer. Shoot
rounds at this volunteer until all rounds are expended or the
volunteer dies. Repeat until all rounds are expended.


[Figure 4. Hypothetical Experiment: Any Acceptable Method for Estimating
Attrition Should Give 100p Expected Kills. The figure shows the urn
holding 1,000,000(1-p) harmless blanks (painted green) and 1,000,000p
absolutely lethal rounds (painted red), the random selection of rounds,
and the three methods for distributing shots among volunteers.]


To be convinced that all three methods give the same expected kills,
simply consider the number of lethal rounds loaded. If R red rounds (and
100-R green rounds) are loaded into the gun, then exactly R volunteers
will die. Thus all three methods give the same number of kills, hence the
same expected kills. Moreover, all 100 rounds will be fired with each
method. The random variable R has essentially a binomial distribution with
success probability p and N = 100 (actually the distribution is
hypergeometric), and the expected value is 100p. Any acceptable method of
estimating attrition using Pk's should estimate the true expected kills,
and sums of Pk's does. Since p is known, the expected value can be
correctly estimated by crediting a partial kill equal to p on each shot
and adding credited kills (sums of Pk's) to give 100p. If p is of
moderate size, then crediting more than one kill against some volunteers
is virtually certain using either the second or third methods; in
particular, if p = 0.8, overkill would occur against any player shot at more
than once, and both the second and third methods virtually guarantee
multiple shots against some players. Applying the product formula would
avoid such overkill, but would give the wrong estimate for expected
kills. In fact, the product formula would generally give different
answers for each of the shot distribution methods if 0 < p < 1 and 0 < R < 100,
and except for the first method, the answers themselves would be random:

-  For the first method, the product formula would credit a partial
kill equal to p against each player, giving the same answer as sums
of Pk's, namely, 100p.

-  For the second method, the product formula would credit p kill on
the first shot against a particular player, credit p(1-p) kill on
the second shot, and in general credit p(1-p)^(k-1) kill on the kth
shot against a particular player. Since p(1-p)^(k-1) < p for k > 1,
credited kills would be less than 100p unless all players were shot at
exactly once (an extremely unlikely occurrence unless p is very near
1). Since at least R different players must be fired at a first
time, credited kills must be at least Rp. The remaining (100-R)
rounds must also be shot and the smallest number of kills would be
credited if all those rounds were shot against a single player.
Thus the smallest number of kills credited by the second method
would be (R-1)p + [1-(1-p)^(100-R+1)], and the actual number of
credited kills would vary randomly from this number to 100p.

-  For the third method, the smallest number of credited kills would be
the same as the second method, but the largest number of credited
kills would be strictly less than 100p unless R = 99 and the last
round left is green. In general the third method would give
substantially fewer credited kills than either other method since at
most R+1 players would be shot at.
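
The hypothetical experiment can be replayed by simulation. The sketch below is one possible rendering under stated assumptions: each loaded round is taken to be lethal independently with probability p (the binomial approximation to the urn draw), and the names run_trial and average are illustrative. It credits kills both ways for each shot distribution method.

```python
import random


def run_trial(p, method, rng, n_rounds=100, n_volunteers=100):
    """Fire one loaded gun; return (deaths, sum_credit, product_credit)."""
    # Each loaded round is lethal with probability p (binomial approximation
    # to drawing from the very large urn).
    rounds = [rng.random() < p for _ in range(n_rounds)]
    alive = list(range(n_volunteers))
    shots = [0] * n_volunteers      # rounds fired at each volunteer
    deaths = 0
    current = None                  # target being engaged under method 3
    for i, lethal in enumerate(rounds):
        if method == 1:             # one round per volunteer, in order
            target = i
        elif method == 2:           # fresh random living target for every round
            target = rng.choice(alive)
        else:                       # method 3: keep firing until the target dies
            if current is None:
                current = rng.choice(alive)
            target = current
        shots[target] += 1
        if lethal:
            deaths += 1
            alive.remove(target)
            current = None
    sum_credit = p * sum(shots)                                # sums of Pk's
    product_credit = sum(1.0 - (1.0 - p) ** k for k in shots)  # product formula
    return deaths, sum_credit, product_credit


def average(p, method, trials=2000, seed=0):
    """Average the three quantities over repeated trials."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(trials):
        for j, value in enumerate(run_trial(p, method, rng)):
            totals[j] += value
    return [t / trials for t in totals]


for method in (1, 2, 3):
    deaths, by_sum, by_product = average(0.3, method)
    print(f"method {method}: mean deaths={deaths:.2f}, "
          f"sum of Pk's={by_sum:.2f}, product formula={by_product:.2f}")
```

In every trial the sum of Pk's equals 100p regardless of method, while the product formula matches it only under the first method and falls short under the second and third.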


Overkill Wrap-up. The preceding discussion shows that some overkill
must  be  allowed  when  estimating  attrition  in  an  RTCA  experiment.  As  long 
as  expected  kills  are  to  be  estimated  by  crediting  partial  kills,  some 
players  will  be  removed  from  play  with  less  than  a  whole  credited  kill,  so 
other  players  must  be  allowed  to  accumulate  more  than  a  whole  credited 
kill  to  make  up  for  the  shortfall. 


Why Needed. Until now, this paper has argued that sums of Pk's should
be used to estimate attrition from RTCA data. However, this argument was
based on the tacit assumption that RTCA works as advertised, assessing all
or almost all engagements correctly. Unfortunately, many engagements
which should go to real time assessment do not, typically because of
instrumentation failure, faulty real-time position-location data or buffer
synchronization problems in real-time computer processing. In addition,
the Pk's used real time may prove to be incorrect due to software errors,
errors in Pk tables, faulty real-time position-location data or simply
inability of instrumentation to capture crucial engagement conditions in
real  time.  Moreover,  Pk’s  may  change  post  test  either  because  new  data 
indicates  the  Pk  tables  should  be  modified  or  because  "what-if"  analyses 
of  Pk’s  are  desired.  In  fact,  there  are  effectively  two  Pk’s  associated 
with  each  RTCA  engagement,  the  Pk  used  real  time  (PKU)  and  the  actual  Pk 
(PKA)  determined  through  post-test  analysis.  Whenever  the  PKA’s  are  not 
equal  to  the  PKU’s,  the  attrition  rate  applied  real  time  was  wrong,  and 
too  many  or  too  few  players  were  left  on  the  test  battlefield.  Simply 
summing  the  Pk’s  (that  is,  the  PKA’s)  could  give  misleading  estimates  of 
attrition  if  PKA’s  were  frequently  unequal  to  PKU’s.  Aliveness  analysis 
is  an  arithmetic  adjustment  for  cumulative  differences  between  PKA’s  and 
PKU’s  which  is  applied  prior  to  summing  Pk’s.  It  is  essentially  a  back  of 
the  envelope  calculation  too  big  to  do  on  the  back  of  an  envelope. 

Adjustment  Approach.  Aliveness  analysis  makes  sensible  adjustments  to 
attrition  estimates  by  reducing  or  increasing  credited  kills  to  compensate 
for  cumulative  errors  in  attrition. 

-  If  PKU  is  less  than  PKA  then  too  little  real  time  attrition  was 
applied  and  the  subsequent  attrition  capability  of  the  target  should 
be  reduced. 

-  If PKU is greater than PKA then too much real time attrition was
applied and subsequent attrition capability of the target should be
increased.

-  On  the  other  hand  if  PKU  is  equal  to  PKA  then  real  time  attrition 
was  just  right  and  no  adjustment  should  be  applied. 

Adjustment Formulas. The concept of aliveness analysis was originated
at CDEC by M. Bryson. The largest application of aliveness analysis to
date was in the analysis of the force-on-force portion of SGT York Follow
on Evaluation I (FOE I), which will be described below. Prior to the
start of FOE I the authors of this paper worked together to refine the
aliveness methodology and produce specific formulas for performing
aliveness analysis. Aliveness analysis adjusts for differences between
real time and post test attrition rates by crediting partial kills via
"potency" or "aliveness" weightings on live players as follows.

Define a "potency" or "aliveness" factor A for each player, where
A_initial = 1 for all players. Track cumulative credited kills by
player I versus player J as K(I,J) with K_initial(I,J) = 0 for all
player pairs. Then when player I [potency A_old(I)] engages player J
[potency A_old(J)] with kill probabilities PKA [actual, from
post-test analysis or revised table] and PKU [used in RTCA], adjust
potency factors and cumulative credited kills as follows:

K_new(I,J) = K_old(I,J) + A_old(J) x [1-(1-PKA)^A_old(I)]

A_new(I) = A_old(I)

A_new(J) = A_old(J) x [(1-PKA)^A_old(I)] / (1-PKU)

Potency of the target, A_new(J), is reduced to zero for any engagement
which goes to real time assessment and results in a dead target.
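
The update rule for a single engagement can be sketched in a few lines. This is an illustrative reading of the formulas above, not the authors' software; the function name aliveness_update, its argument names, and the killed_in_rtca flag are assumptions introduced here, and the PKA/PKU values in the demonstration are arbitrary.

```python
def aliveness_update(a_firer, a_target, pka, pku, killed_in_rtca=False):
    """Apply one engagement; return (credited_kill, new_target_potency).

    a_firer, a_target: potencies A_old(I) and A_old(J) before the engagement.
    pka: actual kill probability; pku: kill probability used in real time.
    killed_in_rtca: True if real time assessment removed the target.
    """
    # Effective survival probability against a firer of potency a_firer.
    effective_survival = (1.0 - pka) ** a_firer
    credited = a_target * (1.0 - effective_survival)
    if killed_in_rtca:
        return credited, 0.0
    # A surviving target implies pku < 1, so the division is defined.
    new_target = a_target * effective_survival / (1.0 - pku)
    return credited, new_target


# Too little attrition applied in real time: the survivor's potency shrinks.
print(aliveness_update(1.0, 1.0, pka=0.6, pku=0.2))
# Too much attrition applied in real time: the survivor's potency grows.
print(aliveness_update(1.0, 1.0, pka=0.6, pku=0.8))
# PKA equal to PKU with a fully potent firer: credit PKA, potency unchanged.
print(aliveness_update(1.0, 1.0, pka=0.4, pku=0.4))
```

The firer's potency is left unchanged by the update, so the function returns only the credited kill and the target's new potency.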

Formula Motivation. The underlying motivation for these formulas is
straightforward. First, the calculation adjusts potency of surviving
players as a ratio of survival probabilities (provided the firer has A = 1):

-  If a player survives with twice the probability he should have (for
example, if PKA = 0.6 and PKU = 0.2), his potency is halved.

-  If a player survives with half the probability he should have (for
example, if PKA = 0.6 and PKU = 0.8), his potency is doubled.

Second, the odd-looking exponential adjustment for the potency of the
firer is actually based on a standard statistical formula:

-  n firings with Pk = p give total Pk = 1-(1-p)^n

-  a potency n player firing with Pk = p gives total Pk = 1-(1-p)^n.

In  addition,  the  calculation 

-  reduces  to  sums  of  Pk's  when  PKA's  always  equal  PKU’s 

-  adjusts  in  the  right  direction  when  firer  potency  is  1,  and 

-  performs  well  in  practice,  as  the  rest  of  this  paper  shows. 

APPLICATION  OF  THE  ALIVENESS  CALCULATION 

Examining  Aliveness.  The  most  straightforward  way  to  examine  the 
aliveness  calculation  is  to  observe  how  it  performs  on  actual  sequences  of 
engagements. The remainder of this paper describes the application of
aliveness methodology to score casualties in the field trials of SGT York
FOE I.

Test Description. The force-on-force portion of FOE I was conducted
at the US Army Combat Developments Experimentation Center (CDEC), Fort
Hunter Liggett, California, from 2 April to 22 May 1985. It was a
platoon level test conducted to compare capabilities of three different
air defense families to provide protection to an armor battalion task
force in similar types of missions. The three air defense families were
nominally called "SGT York," "Baseline," and "Alternate." All three
families had five Stinger missile systems forward and two Chaparral/FLIR
missile systems with the battalion trains. What distinguished the
families was the large air defense systems deployed forward:

-  The SGT York family had four SGT York air defense gun systems
forward.

-  The Baseline family had four Vulcan air defense gun systems forward.

-  The Alternate family had two Vulcan air defense gun systems and
two Chaparral/FLIR missile systems forward.

Test Players. Overall, there were typically more than 60 Blue players
and more than 30 Red players in each trial. In addition to Blue air
defense, the Blue armor task force consisted of roughly 26 Abrams tanks,
13 Bradley fighting vehicles, and 20 other Blue ground forces (M113's,
trucks, etc.). The Red air attack force consisted of four fixed wing (two
Fitters surrogated by A-7's and two Frogfoots surrogated by A-10's) and
four rotor wing (four Hinds or four Havocs, each surrogated by AH-64's).
Three surrogate Red ECM aircraft (one fixed wing and two rotor wing
stand-off jammers) were present on some trials, and three Blue aircraft
(one AH-1S rotor wing and two F-4 fixed wing) were used to investigate
possible fratricide. Finally, a small Red armor force (20 T-80 tanks
surrogated by M-60's and 8 BMP's surrogated by M113's) permitted a limited
armor battle.

Test Criteria. The main mission performance criteria for FOE I
addressed the relative proportion of Blue Force losses to Red Air during
trials when the three different families of air defense systems were
present. That is, for "similar" trials involving each family, it was
necessary to estimate Blue Force losses to Red Air, divide by Blue Force
size to estimate the proportion lost, and then form appropriate ratios of
the proportions. With

Y = Proportion lost in SGT York trials,

B = Proportion lost in Baseline trials, and

A = Proportion lost in Alternate trials,

the required ratio for comparing SGT York versus Baseline was

(B-Y)/B = 1 - Y/B

while the required ratio for comparing Alternate versus Baseline was

(B-A)/B = 1 - A/B.

Over 70 trials were attempted during FOE I, and 52 trials (29 York,
12 Baseline, and 11 Alternate) were validated for analysis. Proportion
lost was estimated in each trial, and the appropriate ratios were
estimated in an analysis of variance framework in order to adjust for
differences in trial conditions between families. The criteria levels and
detailed results are classified, so they will not be discussed fully in
this paper.

FOE Problems. In FOE I, there were frequent differences between PKA
and PKU. The most common case of inequality between PKA and PKU was when
PKA > PKU = 0, which typically occurred when engagements did not go to real
time assessment (that is, no firer-target pairing could be made in real
time) but engagement conditions (hence PKA's) were reconstructed through
post-test analysis. In fact, in SGT York FOE I, across various
firer-target categories, 40 to SO percent of engagements did not go to
real time assessment (see Figure 5), but PKA's were frequently recovered
through post-test analysis (indicated by the dotted areas in Figure 5).
However, the percent recovered was substantially different for different
firer categories because availability of post-test data sources such as
video and audio tapes was different for different firer categories. In
addition to the cases PKA > PKU = 0, there were many cases of 0 < PKA < PKU, which
occurred because the pre-test tabulations of Pk's for Blue air defense
versus Red air were generally too high. Figure 6 shows as an example the
extent to which PKA's differed from PKU's for firings by selected Blue air
defense firers against selected Red air targets. For these cases, PKA's
equalled PKU's less than a third of the time. Almost the only time when
PKA's equalled PKU's was when both were zero (dotted portion of PKA = PKU
bars in Figure 6).

FOE I Examples. The following three examples are based on engagements
in FOE I. Fictitious Pk values and firer-target pairs have been used in
order to keep this paper unclassified. However, the actual engagement
sequences led to essentially the same aliveness calculations given here.
These examples provide convincing evidence that the aliveness calculation
performs well in practice.

Example 1. The first example consists of two engagements against a
surrogate Frogfoot close air support aircraft. In the first engagement,
the actual survival probability (1-PKA = 0.62) was 2.14 times what was
applied in real time (1-PKU = 0.29). That is, in a large number of such
engagements, there would be 2.14 times as many survivors as were observed
real time. The aliveness calculation increases the potency of Frogfoot-1
to 2.14 and credits 0.38 kill against Frogfoot-1. The second engagement
did not go to real time assessment so the real time survival probability
was 1.00. Since the actual survival probability was only 0.66, the
aliveness calculation decreased the potency to 1.41 = 0.66 x 2.14 and credited
0.73 = 2.14 x 0.34 kills against Frogfoot-1. In this example, 1.11 kills were
credited by aliveness while the sum of Pk's was only 0.72 and no simulated
kills were observed. The aliveness result makes the most sense.
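
The arithmetic of the Frogfoot example can be laid out step by step. The following worked block assumes that the firer has potency 1 in both engagements, which is consistent with the figures quoted; values are rounded to two decimals as in the text.

```latex
\begin{align*}
\text{Engagement 1:}\quad
  A_{\text{new}} &= 1.00 \times \frac{0.62}{0.29} \approx 2.14, &
  K &= 1.00 \times (1 - 0.62) = 0.38, \\
\text{Engagement 2:}\quad
  A_{\text{new}} &= 2.14 \times \frac{0.66}{1.00} \approx 1.41, &
  K &= 2.14 \times (1 - 0.66) \approx 0.73, \\
\text{Credited kills} &= 0.38 + 0.73 = 1.11, &
  \textstyle\sum \text{Pk} &= 0.38 + 0.34 = 0.72 .
\end{align*}
```
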
Example 2. The second example consists of a series of engagements by
SGT York-1. The first engagement against Fitter-1 is similar to those
already considered, except that the target was killed real time so that
instead of increasing to 1.61, its potency was reduced to zero. In the
second engagement, however, SGT York-1 was the target of Hind-3.
SGT York-1 should have survived with probability 0.28, but survival
probability 1.00 was in effect applied real time because the engagement
did not go to assessment. The aliveness calculation decreased the potency
to 0.28 and credited 0.72 kill against SGT York-1. Then in the third
engagement, SGT York-1 with potency 0.28 fired at Fitter-3 with potency
1.94. The effective actual survival probability should be greater than
1-PKA = 0.69 because the firer was only "partially alive"; that is, if this
trial were repeated many times with perfect RTCA, SGT York-1 would only be
around to fire a small fraction of the time. The aliveness formula says
that the effective survival probability should be 0.90 = 0.69^0.28, and
intuition suggests no better number. Thus the new potency of the target
increased by 1.77 = 0.90/0.51 times to 3.42, and 0.19 = (1-0.90) x 1.94 kill was
credited. Once a firer has potency less than 1.00, not only are credited
kills reduced but also potency of targets tends to be increased. This is
illustrated by the last engagement of this example, where even though PKA
and PKU were the same, potency of the target increased by 23% to 1.23.
Overall in this example, 0.53 kills were credited by aliveness against Red
while the sum of Pk's was 0.82, and there was one simulated kill. This is
exactly the reverse relationship from the previous example where credited
kills were largest, followed by sum of Pk's and then by simulated kills.
Again, the aliveness result makes the most sense.
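
The engagement sequence of Example 2 can be replayed mechanically. The sketch below is an illustration only: the helper aliveness_update and the bookkeeping dictionaries are names introduced here, the PKA and PKU values are the fictitious ones tabulated for this example, and the starting potency of 1.94 for Fitter-3 is taken as given from engagements not shown.

```python
def aliveness_update(a_firer, a_target, pka, pku, killed_in_rtca=False):
    """Return (credited_kill, new_target_potency) for one engagement."""
    effective_survival = (1.0 - pka) ** a_firer
    credited = a_target * (1.0 - effective_survival)
    if killed_in_rtca:
        return credited, 0.0
    return credited, a_target * effective_survival / (1.0 - pku)


# (firer, target, PKA, PKU, target killed in real time?)
engagements = [
    ("SGT YORK-1", "FITTER-1", 0.26, 0.54, True),
    ("HIND-3", "SGT YORK-1", 0.72, 0.00, False),
    ("SGT YORK-1", "FITTER-3", 0.31, 0.49, False),
    ("SGT YORK-1", "HIND-3", 0.25, 0.25, False),
]
potency = {"SGT YORK-1": 1.0, "FITTER-1": 1.0, "HIND-3": 1.0, "FITTER-3": 1.94}
side = {"SGT YORK-1": "Blue", "FITTER-1": "Red", "HIND-3": "Red", "FITTER-3": "Red"}
credited_against = {"Red": 0.0, "Blue": 0.0}
sum_of_pk = {"Red": 0.0, "Blue": 0.0}

for firer, target, pka, pku, killed in engagements:
    credit, new_potency = aliveness_update(
        potency[firer], potency[target], pka, pku, killed)
    potency[target] = new_potency
    credited_against[side[target]] += credit
    sum_of_pk[side[target]] += pka
    print(f"{firer:>10} -> {target:<10} credited={credit:.2f} "
          f"new target potency={new_potency:.2f}")

print("credited kills:", {k: round(v, 2) for k, v in credited_against.items()})
print("sums of Pk's:  ", {k: round(v, 2) for k, v in sum_of_pk.items()})
```

Small differences in the second decimal place relative to the text can arise because the paper rounds intermediate values such as 0.90 before multiplying.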


            OLD               OLD                               NEW
FIRER       FIR   TARGET      TGT                               TGT    CRTD  SIM
ID          PTCY  ID          PTCY  PKA   PKU   1-PKA  1-PKU    PTCY   KILL  KILL

SGT YORK-1  1.00  FITTER-1    1.00  0.26  0.54  0.74   0.46     0.00   0.26  KILL
HIND-3      1.00  SGT YORK-1  1.00  0.72  0.00  0.28   1.00     0.28   0.72  N/A
SGT YORK-1  0.28  FITTER-3    1.94  0.31  0.49  0.69   0.51     3.42   0.19  SURV
SGT YORK-1  0.28  HIND-3      1.00  0.25  0.25  0.75   0.75     1.23   0.08  SURV

Summary of Attrition Estimates

               Simulated Kills   Sum of Pk's   Credited Kills
Against Red         1.00            0.82            0.53
Against Blue        0.00            0.72            0.72


Example 3. The final example is much more routine, involving a firer
with  aliveness  one,  where  PKU’s  were  either  correct  or  were  zero  because 
the  engagement  did  not  go  to  real  time  assessment.  In  all  but  one 
instance,  the  credited  kill  was  equal  to  PKA,  and  all  three  measures  of 
attrition  nearly  agree. 
OVERALL IMPACT OF ALIVENESS ANALYSIS IN FOE I

Trial-by-Trial Summary. Figure 7 displays trial-by-trial estimates
for the proportion of Blue ground lost to Red air by each of the three
methods. It shows that results of the aliveness calculation tended to
fall between simulated kills and results obtained by sums of Pk's. This
occurs because the most common RTCA error was failing to go to real time
assessment when a Pk > 0 should have been used, which produces no simulated
kills and gives potency less than one to survivors. Differences between
the attrition estimates were substantial in some trials. (The apparent
smoothness of the aliveness curve in Figure 7 compared to the other two
curves is due primarily to the order in which trials were sorted for
plotting.)

Analysis of Variance Results and Overall Conclusions. For engagements
by Red air against Blue ground during FOE I, the tendency for sums of Pk's
to be larger than credited kills from aliveness, which in turn tend to be
larger than simulated kills, carried over to the attrition estimates
obtained from analysis of variance in a general linear models framework.
Sums of Pk's were used instead of aliveness analysis to present attrition
estimates to decision makers because sums of Pk's are simpler and they
gave the same result as aliveness for the crucial SGT York family: the
criteria were met. In retrospect, this agreement appears to have been
luck. Even though the direction of comparisons between attrition
estimates based on sums of Pk's, aliveness, and simulated kills was
consistent across air defense families, the relative size of the
differences between estimates was not consistent across families. In one
case involving Alternate and Vulcan families, a crucial estimate based on
aliveness was less than half that obtained from sums of Pk's and made a
difference whether or not an important criterion was met. Results from
aliveness analysis can be substantially different from analyses based
either on simulated kills from RTCA or on unmodified sums of Pk's. Since
there can be a real difference between results of the techniques unless
RTCA works extremely well, a preferred technique should be chosen. Both
simulated kills from RTCA and unmodified sums of Pk's give wrong attrition
estimates when PKA's differ from PKU's. Thus aliveness analysis should be
the method of choice.
1. Introduction

We shall be concerned with the problem of parameter estimation by
means of the likelihood function for a class of four-parameter
distributions (hyper-Gamma class) characterized by the probability
density function (pdf) class

$$f(x,P) = \begin{cases} \beta\, b^{-1}\, \Gamma^{-1}\big((1-p)\beta^{-1}\big)\, t^{-p} \exp(-t^{\beta}), \quad t = (x-s)\,b^{-1}, & x > s, \\ 0, & x < s. \end{cases} \tag{1.1}$$

The parameter vector $P = (s,b,p,\beta)$ has the components $s$ = shift
(location), $b$ = scale, $p$ = initial shape, $\beta$ = terminal shape, with
$b > 0$, $p < 1$, $\beta > 0$. In practice we are given a set of absolute
frequency data $(x_\nu, f_{a\nu})$ $(\nu = 1,\dots,m)$, $f_{a1} > 0$,
$f_{a\nu} \ge 0$ $(\nu = 2,\dots,m-1)$, $f_{am} > 0$,
$\sum_{\nu=1}^{m} f_{a\nu} = N$ = total number of observations. The shift
parameter $s$, therefore, is restricted to $s < x_1$.

The distribution class defined by (1.1) is of wide applicability. As
special cases it contains a number of distributions well-known in
statistics and statistical physics: Gauss ($p=0$, $\beta=2$), Weibull
($p=1-\beta<1$), exponential ($p=1-\beta=0$), Rayleigh ($p=1-\beta=-1$),
Gamma ($p<1$, $\beta=1$), chi-square ($p=(2-\nu)/2$, $\beta=1$), Maxwell
($p=-2$, $\beta=2$), Wien ($p=-3$, $\beta=1$).

Relative to the most essential parameters $p$ and $\beta$, the class
(1.1) covers the open quadrant $p < 1$, $\beta > 0$ of the $(p,\beta)$-plane.
The locations of the special cases just mentioned are shown in the
accompanying figure.
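
The density (1.1) is straightforward to evaluate. The sketch below is a minimal illustration, with hyper_gamma_pdf and integrate_pdf as names chosen here; it checks numerically that the density integrates to one in the Maxwell and exponential special cases. The crude midpoint rule is an assumption of convenience and would be less accurate when 0 < p < 1, where the density is unbounded near x = s.

```python
import math


def hyper_gamma_pdf(x, s, b, p, beta):
    """Density (1.1) with shift s, scale b, initial shape p, terminal shape beta."""
    if b <= 0 or p >= 1 or beta <= 0:
        raise ValueError("require b > 0, p < 1, beta > 0")
    if x <= s:
        return 0.0  # (1.1) is zero below the shift; the boundary is treated alike
    t = (x - s) / b
    norm = beta / (b * math.gamma((1.0 - p) / beta))
    return norm * t ** (-p) * math.exp(-t ** beta)


def integrate_pdf(s, b, p, beta, upper_t=20.0, steps=20000):
    """Midpoint-rule integral of the density over s < x < s + upper_t * b."""
    width = upper_t * b / steps
    return width * sum(
        hyper_gamma_pdf(s + (i + 0.5) * width, s, b, p, beta)
        for i in range(steps)
    )


# Maxwell (p = -2, beta = 2) and exponential (p = 0, beta = 1) special cases.
print(integrate_pdf(s=1.0, b=2.0, p=-2.0, beta=2.0))
print(integrate_pdf(s=0.0, b=2.0, p=0.0, beta=1.0))
# The exponential case reduces to (1/b) exp(-x/b).
print(hyper_gamma_pdf(1.0, 0.0, 2.0, 0.0, 1.0), 0.5 * math.exp(-0.5))
```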

It is our objective to show that the class (1.1) is easy to apply in
practice. In other words, we shall show that, for a given set of
frequency data $(x_\nu, f_{a\nu})$ $(\nu = 1,\dots,m)$, the four parameters
$s$, $b$, $p$, and $\beta$ can easily be estimated. The likelihood function
approach will be used.

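2. The Likelihood Function

Let $(x_\nu, f_{a\nu})$ $(\nu = 1,\dots,m)$, $\sum_{\nu=1}^{m} f_{a\nu} = N$,
be a set of given frequency data. We set $\log(x_\nu - s) = \rho_\nu$. Then
the likelihood function $L(P)$ for the pdf class (1.1) takes the form

$$L(P) = \beta^{N}\, \Gamma^{-N}\big((1-p)\beta^{-1}\big)\, b^{-(1-p)N} \exp\Big(-p \sum_{\nu=1}^{m} f_{a\nu}\rho_\nu\Big) \exp\Big(-b^{-\beta} \sum_{\nu=1}^{m} f_{a\nu} e^{\beta\rho_\nu}\Big).$$

Introducing relative frequencies $f_\nu = N^{-1} f_{a\nu}$, we obtain for
$R(P) = N^{-1} \log L(P)$ the function

$$R(P) = \log\beta - \log\Gamma\big((1-p)\beta^{-1}\big) - (1-p)\log b - pC - b^{-\beta}B \tag{2.1}$$

in which

$$B = B(s,\beta) = \sum_{\nu=1}^{m} f_\nu e^{\beta\rho_\nu} > 0, \qquad C = C(s) = \sum_{\nu=1}^{m} f_\nu \rho_\nu .$$

The objective is to maximize the function $R(P)$ given in (2.1) under
the constraints $s < x_1$, $b > 0$, $p < 1$, $\beta > 0$.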
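
The function (2.1) can be evaluated directly from grouped data. The sketch below is an illustration under stated assumptions: the frequency table xs, fa and the trial parameter values are hypothetical, and the function names are chosen here. It cross-checks (2.1) against the frequency-weighted average of the log density (1.1), which is what (2.1) amounts to when the log-likelihood is divided by N.

```python
import math


def log_likelihood_R(xs, fa, s, b, p, beta):
    """Evaluate R(P) of (2.1) from values xs and absolute frequencies fa."""
    n_total = sum(fa)
    f = [count / n_total for count in fa]          # relative frequencies
    rho = [math.log(x - s) for x in xs]            # requires s < min(xs)
    B = sum(fv * math.exp(beta * r) for fv, r in zip(f, rho))
    C = sum(fv * r for fv, r in zip(f, rho))
    return (math.log(beta) - math.lgamma((1.0 - p) / beta)
            - (1.0 - p) * math.log(b) - p * C - B / b ** beta)


def mean_log_density(xs, fa, s, b, p, beta):
    """Frequency-weighted average of log f(x, P), computed term by term."""
    n_total = sum(fa)
    total = 0.0
    for x, count in zip(xs, fa):
        t = (x - s) / b
        log_f = (math.log(beta) - math.log(b) - math.lgamma((1.0 - p) / beta)
                 - p * math.log(t) - t ** beta)
        total += count * log_f
    return total / n_total


# Hypothetical frequency table and trial parameter values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
fa = [3, 7, 5, 2, 1]
params = dict(s=0.5, b=1.5, p=-0.5, beta=1.2)
print(log_likelihood_R(xs, fa, **params))
print(mean_log_density(xs, fa, **params))
```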

The partial derivatives of $R$ with respect to $s$, $b$, $p$, and $\beta$
(in this order) lead to the equations

$$pE + \beta b^{-\beta} F = 0, \tag{2.2}$$

$$-(1-p)\,b^{-1} + \beta\, b^{-\beta-1} B = 0, \tag{2.3}$$

$$\beta^{-1}\psi\big((1-p)\beta^{-1}\big) + \log b - C = 0, \tag{2.4}$$

$$\beta^{-1} + (1-p)\beta^{-2}\psi\big((1-p)\beta^{-1}\big) + b^{-\beta} B \log b - b^{-\beta} D = 0. \tag{2.5}$$

The equations (2.2),...,(2.5) are the four likelihood equations in the
four unknowns $s$, $b$, $p$, and $\beta$, relative to the logarithmic
likelihood function (2.1).
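
The auxiliary sums E, F and D are not defined in the text as it stands. Differentiating (2.1) term by term suggests the following reading, offered here as an assumption rather than as the author's stated definitions; ψ is taken to be the logarithmic derivative of the Gamma function.

```latex
\begin{align*}
\rho_\nu &= \log(x_\nu - s), &
\frac{\partial \rho_\nu}{\partial s} &= -e^{-\rho_\nu}, \\
E &= \sum_{\nu=1}^{m} f_\nu e^{-\rho_\nu} = -\frac{\partial C}{\partial s}, &
F &= \sum_{\nu=1}^{m} f_\nu e^{(\beta-1)\rho_\nu}
   = -\beta^{-1}\frac{\partial B}{\partial s}, \\
D &= \sum_{\nu=1}^{m} f_\nu \rho_\nu e^{\beta\rho_\nu}
   = \frac{\partial B}{\partial \beta}, &
\psi(z) &= \frac{d}{dz}\log\Gamma(z).
\end{align*}
```

With these sums, the derivative of (2.1) with respect to s is $-p\,\partial C/\partial s - b^{-\beta}\,\partial B/\partial s = pE + \beta b^{-\beta}F$, which is the left side of (2.2).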


We make the following essential observations.

First, since $E > 0$, $\beta > 0$, $b > 0$, $F > 0$, equation (2.2) shows
that, if the shift parameter $s$ is considered as unknown, the parameter
$p$ must be less than zero.

Secondly, equation (2.3) shows that the scale parameter $b$ can be
expressed in terms of the parameters $s$, $p$, and $\beta$:

$$b^{\beta} = (1-p)^{-1} \beta B. \tag{3.1}$$

Thirdly, using this expression for $b^{\beta}$ in (2.4) and (2.5), we see
that the initial shape parameter $p$ can be eliminated since it can be
expressed in terms of $s$ and $\beta$:

$$(1-p)^{-1}\beta = A B^{-1}, \qquad p = 1 - \beta A^{-1} B, \tag{3.2}$$

where

$$A = A(s,\beta) = \beta\,(D - BC).$$

Consequently, out of the set of the four equations (2.2),...,(2.5), we
need retain only two, namely (2.2) and (2.4). Eliminating from these $b$
and $p$ by means of (3.1) and (3.2) we arrive at the two equations

$$g(s,\beta) = \psi(A^{-1}B) + \log A - \beta C = 0, \tag{3.3}$$

$$h(s,\beta) = (\beta^{-1}A - B)\,E + F = 0. \tag{3.4}$$

The solution $(\hat{s}, \hat{\beta})$ of these equations and the
corresponding numbers

$$\hat{p} = 1 - \hat{\beta}\, A^{-1}(\hat{s},\hat{\beta})\, B(\hat{s},\hat{\beta}),$$

$$\hat{b} = \exp\Big\{\hat{\beta}^{-1} \log\big[\hat{\beta}\, B(\hat{s},\hat{\beta})\,(1-\hat{p})^{-1}\big]\Big\}$$

obtained from the auxiliary formulas (3.2) and (3.1) give us the desired
estimates for the four parameters relative to a given set of frequency
data $(x_\nu, f_\nu)$.
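
The back-substitution step can be sketched as follows. This is not a solver for (3.3) and (3.4); it only shows how p and b follow from a trial pair (s, β) through (3.2) and (3.1). The function name back_substitute, the frequency table, and the trial values are hypothetical, and the sum D is computed under the assumption that it denotes the frequency-weighted sum of ρ e^{βρ}, a definition the text does not spell out.

```python
import math


def back_substitute(xs, fa, s, beta):
    """Recover p and b from a trial (s, beta) via (3.2) and (3.1)."""
    n_total = sum(fa)
    f = [count / n_total for count in fa]
    rho = [math.log(x - s) for x in xs]            # requires s < min(xs)
    B = sum(fv * math.exp(beta * r) for fv, r in zip(f, rho))
    C = sum(fv * r for fv, r in zip(f, rho))
    # Assumed definition of D (not stated explicitly in the text).
    D = sum(fv * r * math.exp(beta * r) for fv, r in zip(f, rho))
    A = beta * (D - B * C)
    p = 1.0 - beta * B / A                                    # (3.2)
    b = math.exp(math.log(beta * B / (1.0 - p)) / beta)       # (3.1)
    return {"A": A, "B": B, "C": C, "D": D, "p": p, "b": b}


# Hypothetical frequency table and trial values of s and beta.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
fa = [3, 7, 5, 2, 1]
result = back_substitute(xs, fa, s=0.5, beta=1.2)
print(result)
# Under (3.1) and (3.2) the scale satisfies b**beta = A.
print(result["b"] ** 1.2, result["A"])
```

In a full estimation, a two-dimensional root finder would vary (s, β) until g and h in (3.3) and (3.4) vanish, calling a routine like this one at the end to obtain the remaining two parameters.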

For the numerical solution of the system of equations (3.3) and (3.4),
it is convenient to introduce the functions X, S, and P, defined by

A  =  Je*PmX  ,  B  =  .,Pm&  .  . 

Examples for the solution of equations (3.3) and (3.4) will be given in
the paper by Mr. H. P. Oudel.
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ABSTRACT.  Many  authors  of  texts  and  articles  on  factor  analysis 
recommend  rotation  of  factors  after  original  solutions  of  unrotated 
factors.  This  goal  1e  readily  achieved  today  with  the  aid  of  'canned 
programs'  on  most  of  the  bigger  computer  systems.  It  was  noticed, 
however,  that  for  a  system  of  factor  analysis  of  atmospheric  parameters 
orthogonal  rotation  was  significantly  different  for  only  some  methods  of 
estimating  commonalities.  Furthermore,  oblique  rotation  differed  little 
from  orthogonal  rotation. 

It  will  be  shown  that  in  cases  where  the  alignment  of  data  in  a 
(rectangular)  system  ore  close  to  the  abscissa  and  ordinate  of  the  system 
rotation  does  not  contribute  much  to  further  alignment.  Whenever  the 
originol  factors  dlsploy  scatter  in  the  diagrams  the  dispersion  is  already 
reduced  by  on  orthogonal  rotation.  Thus  oblique  rotation  would  not  bring 
much  improvement. 

It  will  oe  discussed  that  the  'simplification'  of  factors  by  rotation 
will  aid  in  the  diagnosis  of  the  system  but  does  not  Improve  the  task  of 
prediction  from  the  system, 

1. INTRODUCTION. With the availability of 'canned programs' for
factor analysis the mathematical difficulties have largely been resolved,
although one should carefully consider the mathematical background on
which these 'canned programs' are based. After calculation of the
unrotated factors many authors (e.g., Cattell 1952, 1965) recommend
simplification by rotation of the systems. The concept of rotation is not
supported by some authors dealing with meteorological data. In fact, Buell
(1971) finds it completely unnecessary.

In a previous study the author (1986a) deduced that rotation resulted
in an alignment of factors although the factors were obtained by different
methods of estimating the 'communalities'. Thus rotation in factor
analysis of climatological data seems to serve a useful purpose, reducing
the individuality and subjectivity in the decision of estimating the
communalities. It was discovered, however, that little difference
between orthogonal and oblique rotation of climatological factor data
showed up. Thus the author decided to study rotation in factor analysis of
climatological data in more detail.

The analysis disclosed that whenever the original factor analysis
displayed little scatter of the factor components plotted into a
rectangular coordinate system, the rotation did not render significantly
different results. Factor components with larger dispersion, however,
are better aligned along the axes and show less scatter after rotation.
In the given examples from climatological data of Stuttgart, Germany,
orthogonal rotation already diminished the scatter to a point where
oblique rotation could not contribute to a further reduction.

It can be shown that rotation may simplify the factors and decrease
the scatter but does not improve the ability of prediction utilizing
the factor analysis. The prediction error remains the same, whether
rotated or unrotated.

2. FACTOR MODEL AND ESTIMATION. The factor model is based on:

$M_X = M_A M_F + M_\varepsilon$   (1)

where $M_X$ is a data matrix (symmetric), $M_A$ a coefficient matrix and
$M_F$ a factor matrix. $M_\varepsilon$ is an error matrix; in the principal
components analysis with the number of factors corresponding to the
dimension of the data matrix, $M_\varepsilon = 0$. $M_A$ is also called the
factor loading matrix or factor pattern. For diagnostic purposes $M_F$ is
not calculated in most cases.

The mathematical solution of eqn. (1) can be formulated:

$M_X = M_A \Phi M_A^T + \Psi$   (2)

which is an eigenvector problem. $\Phi$ is a factor covariance matrix,
$\Phi = M_F^T M_F$. $\Psi$ is a diagonal matrix if the errors and the
factors are uncorrelated. This is generally assumed. In its standard form
$M_X$ is a correlation matrix with unity in the diagonal (communalities).
In this form the factors are called principal components. In the true
factor analysis the assumption is made that not all factors are known.
Thus the diagonal element is < 1.0. Several substitutions have been
suggested for the communalities (e.g. Guttman, 1956, or see Essenwanger,
1976, p. 281). In recent times several estimation methods for the
communalities as described by Joreskog (1967) have been developed based on
statistical principles.

The unweighted least squares (ULSQ) method requires that U is a
minimum for:

$U = (1/2)\,\mathrm{tr}\,(M_S - M_X)^2$   (3)

where $M_S$ is the correlation matrix with estimates in the diagonal and tr
means the trace.

The generalized least squares (GLSQ) method minimizes G for:

$G = (1/2)\,\mathrm{tr}\,(I - M_S^{-1} M_X)^2$   (4)

and the maximum likelihood (MXLI) method M for:

$M = \mathrm{tr}\,(M_X^{-1} M_S) - \ln |M_X^{-1} M_S| - n$   (5)

In addition to the three methods described above, a truncation in the
number of factors obtained from the principal components analysis could
also be used (see Essenwanger, 1986a, b). Since this method maximizes
the representation of the variance, the same number of factors as in the
other three methods will have a higher percentage of representation of
the variance.

Rotation serves to simplify the factors (see Essenwanger, 1976, p.
285, or 1986a). Rotation can be accomplished by a transformation
matrix V such as:

Mf  -  |  (6) 

where Mp is the matrix rotated by orthogonal rotation. In various cases
simplification is not sufficiently achieved by an orthogonal rotation and
an oblique rotation is appropriate. In this case two matrices must be
obtained:


3  MAT2 


(7a) 
where $M_s$ is named factor structure and $M_p$ factor pattern matrix. The
structure matrix $M_s$ represents the covariances (correlations) between
factors and variables and $M_p$ can be interpreted as regression
coefficients. In the orthogonal rotation the factors remain uncorrelated
while in the oblique case the factors are correlated.


3. EXAMPLE OF ORTHOGONAL ROTATION. A simple example is
illustrated for an orthogonal rotation of factors. A principal components
analysis was performed for the observed data of Frankfurt during
January 1946-1956. The correlation matrix between four elements
(visibility, i.e. logarithm of visibility, pressure, temperature, and
windspeed) is given in Table 1. Table 2 exhibits the four factors from
this principal components analysis. The (orthogonally) rotated factors
are shown in Table 3. Figure 1 depicts one particular phase of this
rotation between (modified) factors 1 and 4. In this case the rotation
angle calculated by the VARIMAX method (see Kaiser, 1958, or Cattell and
Khanna, 1977) was -39°. The figure illustrates that the four points are
much closer to the axes after rotation of the coordinate system. Since
the rotation is orthogonal the other factors are not affected.
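
As a rough illustration of one phase of such a rotation, the sketch below
rotates two columns of a loading matrix through a given angle and checks
that the other factors and the communalities (row sums of squares) are
unchanged. The loading values, the helper names `rotate_pair` and
`communalities`, and the sign convention of the angle are assumptions made
for illustration; they are not taken from the paper's tables or figures.

```python
import math


def rotate_pair(loadings, i, j, angle_deg):
    """Rotate factor columns i and j of a loading matrix in their plane."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    rotated = [row[:] for row in loadings]  # copy; other columns stay as is
    for row_in, row_out in zip(loadings, rotated):
        a, b = row_in[i], row_in[j]
        row_out[i] = c * a + s * b
        row_out[j] = -s * a + c * b
    return rotated


def communalities(loadings):
    """Row sums of squared loadings."""
    return [sum(v * v for v in row) for row in loadings]


# Hypothetical 4 x 4 loading matrix (illustrative values only).
A = [
    [0.70, 0.10, 0.05, 0.60],
    [0.10, 0.90, -0.20, 0.10],
    [0.30, 0.20, 0.85, -0.25],
    [0.75, -0.15, 0.00, -0.55],
]

# Rotate factors 1 and 4 (indices 0 and 3) by -39 degrees.
B = rotate_pair(A, 0, 3, -39.0)
print(communalities(A))
print(communalities(B))
```

Because the rotation acts only in the plane of the two chosen factors, the
middle two columns of `B` are identical to those of `A`.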


4. EXAMPLES OF ROTATION AND COMPARISON OF ESTIMATION METHODS.


While comparing estimation methods for communalities in factor
analysis (Essenwanger, 1986a, b) it was noticed that oblique rotation
and orthogonal rotation did not differ much for the Stuttgart, Germany
climatological data samples. A typical example is exhibited here in
Tables 4 and 5. For better readability values < 0.4 were omitted. For
the orthogonal rotation the factor loads and for the oblique rotation the
structure matrix are shown. It is apparent that orthogonal and oblique
rotation differ very little. The rotation procedure, demonstrated in
section three, is depicted in Figures 2 and 3. They provide an example
from a truncated principal components analysis for the January
1946-1953 data at Stuttgart, Germany. Nine climatological elements
(ceiling, cloud amount, visibility, i.e. its logarithm, wind direction and
speed, temperature, dewpoint, relative humidity, and pressure) were
chosen for the factor analysis. Four factors were retained.


Figure 3 illustrates the reduction of the scatter by orthogonal
rotation. In the graph the components of one factor were used as the
abscissa and the components of the second as the ordinate. We may
assume that further rotation for simplification is unnecessary if the
component pairs for the two factors fall within a distance of ±0.2 from
the axes. For the four factors (54 points for 9 elements) 20 points
remain outside this band in the unrotated factors case (Figure 2). After
orthogonal rotation only four data points remain outside the band
(Figure 3). This leaves little room for improvement by an oblique
rotation.
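
The counting criterion can be sketched in a few lines. The example below
applies it to the small Frankfurt example, with loadings as read from
Tables 2 and 3 (4 elements, 4 factors, hence 6 factor pairs and 24 points).
The function name `points_outside_band` is hypothetical, and the reading of
the criterion (a point is acceptable when it lies within 0.2 of at least
one of the two axes) is an assumption about the author's procedure.

```python
from itertools import combinations


def points_outside_band(loadings, band=0.2):
    """Count (element, factor-pair) points lying outside the tolerance band."""
    n_factors = len(loadings[0])
    count = 0
    for row in loadings:
        for i, j in combinations(range(n_factors), 2):
            # The point (row[i], row[j]) is outside the band only when it
            # is farther than `band` from both axes.
            if abs(row[i]) > band and abs(row[j]) > band:
                count += 1
    return count


# Rows: ln V, P, T, W. Values as read from Table 2 (unrotated factors).
unrotated = [
    [0.76, 0.18, -0.45, -0.42],
    [-0.10, 0.95, -0.19, 0.24],
    [0.44, 0.31, 0.83, -0.15],
    [0.85, -0.21, -0.04, 0.49],
]

# Values as read from Table 3 (orthogonally rotated factors).
rotated = [
    [0.96, 0.22, -0.07, 0.17],
    [-0.12, 0.98, 0.06, -0.05],
    [0.14, 0.02, 0.99, 0.08],
    [0.36, -0.07, 0.08, 0.95],
]

print(points_outside_band(unrotated), points_outside_band(rotated))
```

Under this reading, 10 of the 24 points fall outside the band before
rotation and 2 after it, which mirrors the reduction reported for the
Stuttgart data.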

The problem of rotation was further analysed for 12 factor analysis
results, although only for one station: Stuttgart, Germany. Table 6
discloses the count of data points outside the postulated acceptance
band ±0.2 around the axes. The left hand part shows the counts for the
unrotated factors and the right hand part the counts after an orthogonal
transformation of axes was performed.

Although this count should not be used as the only source for
evaluation and interpretation of the merits of an oblique rotation, it
discloses some interesting facts. Apparently the principal
components analysis and the unweighted least squares estimations show
the greatest dispersion of the 54 component points for the unrotated
factors. After rotation only a few points remain outside the "desirable"
bounds. A closer scrutiny reveals that oblique rotation may provide an
improvement through the alteration of axes only in a few cases (e.g.
winter, 12h) where the orthogonal rotation left between 10 and 14 points
outside the ±0.2 band. Thus the improvement in simplification may be
judged by a count of the number of points falling outside the ±0.2
tolerance band. In addition, the distance from the origin (magnitude of
vectors) can be included in the judgement criteria.

The closeness between orthogonal and oblique rotation can also be judged
by a comparison of the transition matrices T1 and T2. For the data of
Table 4 and the principal components analysis those two matrices are
given in Table 7. A close inspection reveals that the corresponding
numerical values differ very little between T1 and T2. In addition to the
transition matrices the correlation between factors can be examined. In
the orthogonal case the factors are uncorrelated and the numerical value
is zero. The factor correlation matrix in Table 7 displays that the
correlation coefficient, although not precisely zero, is extremely low.
In fact, the deviations from zero do not hold up under the scrutiny of a
statistical significance test at the 95% level of significance. In this
case oblique rotation would not be necessary.


5. PREDICTION FROM FACTOR ANALYSIS. It was illustrated in the
previous sections that for climatological data oblique rotation would not
add to simplification and diagnosis in factor analysis beyond the
achievements by orthogonal rotation. One question remained: would
oblique rotation improve the prediction based on factor analysis? From
a theoretical point of view rotation would neither improve nor diminish
the results for prediction. This expectation is confirmed by the data
presented in Table 8 as follows.


A factor analysis was performed as a pilot for a sample of 15
observation data sets randomly chosen from the winter months December
1946 - February 1948 at Stuttgart. Although the sample is small it
reveals the essential facts. The nine elements were chosen as
previously used. For every element the prediction was based on four
factors whose components were calculated for the 15 observations.
Then the differences $\varepsilon^2 = \sum (x - x_p)^2 / N$ were calculated
for every element (x = observed, $x_p$ = predicted). The result for the
unrotated case and the oblique rotation is found in Table 8. As expected
the two error columns $\varepsilon^2$ are identical except for one
difference by rounding. This result implies that the goodness of fit for
prediction depends only on the number of factors and is independent of
rotation. In a previous article (Essenwanger, 1986b) it was pointed out,
however, that the (truncated) principal components analysis provided the
highest percentage approximation of the total variance. Thus the quality
of prediction would depend on the estimation method for the communalities.
In this article it was also demonstrated that the dissimilarity of the
factors obtained by differences in estimating the communalities virtually
vanishes for climatological data after orthogonal (and oblique) rotation.
Thus the simplification achieved by rotation leads to the same
"climatological factors."

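The independence of the prediction error from rotation can be checked
numerically. The sketch below is not the author's computation: the scores,
loadings, observations, and the oblique transformation `T` are made-up
values, and the helper names are hypothetical. It relies only on the
identity that replacing the loadings A by A T and the factor scores F by
F (T^-1)^T leaves the prediction F A^T unchanged.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def inverse_2x2(t):
    (a, b), (c, d) = t
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mean_squared_errors(observed, predicted):
    """Per-element error: sum of (x - x_p)^2 over observations, divided by N."""
    n = len(observed)
    return [sum((o[j] - p[j]) ** 2 for o, p in zip(observed, predicted)) / n
            for j in range(len(observed[0]))]


# Hypothetical data: 5 observations, 3 elements, 2 retained factors.
scores = [[1.0, 0.5], [-0.5, 1.2], [0.3, -0.8], [-1.1, -0.4], [0.3, -0.5]]
loadings = [[0.9, 0.1], [0.2, 0.8], [0.6, -0.5]]
observed = [[0.95, 0.55, 0.40], [-0.30, 0.90, -0.95], [0.15, -0.60, 0.55],
            [-1.00, -0.50, -0.45], [0.25, -0.30, 0.45]]

# Hypothetical oblique (non-orthogonal) transformation of the two factors.
T = [[1.0, 0.3], [0.2, 1.0]]
rot_loadings = matmul(loadings, T)
rot_scores = matmul(scores, transpose(inverse_2x2(T)))

err_unrotated = mean_squared_errors(
    observed, matmul(scores, transpose(loadings)))
err_rotated = mean_squared_errors(
    observed, matmul(rot_scores, transpose(rot_loadings)))
print(err_unrotated)
print(err_rotated)
```

The two error lists agree to rounding, as in the two error columns of
Table 8.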

6. CONCLUSION AND SUMMARY. Rotation in factor analysis was studied
in detail for climatological data samples. Although the study was
limited to one station (Stuttgart, Germany) the result may indicate that
orthogonal rotation of factors for climatological data may be sufficient
to achieve simplification. Where the factor analysis is utilized as a
prediction tool, rotation is not needed unless simplification is
desirable, since the percentage approximation of the total variance is
not improved by rotation. However, rotation serves in aligning the
originally dissimilar factors to a uniform system of factors in terms of
climatology although the individual estimators for the communalities
differ.

Table 1. Correlation Matrix for Four Meteorological Elements, Frankfurt,
Germany, 1946-1956 (Matrix Mx)

          ln(V)      P       T       W
ln(V)     1.0      0.08    0.08    0.42
P         0.08     1.0     0.05   -0.16
T         0.08     0.05    1.0     0.20
W         0.42    -0.16    0.20    1.0

V = Visibility, P = Pressure, T = Temperature, W = Windspeed


Table 2. Factor Matrix MA for Correlation Matrix of Table 1
(Principal Components Factors)

                Factor 1  Factor 2  Factor 3  Factor 4
ln(V)             0.76      0.18     -0.45     -0.42
P                -0.10      0.95     -0.19      0.24
T                 0.44      0.31      0.83     -0.15
W                 0.85     -0.21     -0.04      0.49
(Eigenvalue)      1.50      1.07      0.93      0.50


Table 3. Rotated Factor Matrix (Orthogonal Rotation)

                Factor 1  Factor 2  Factor 3  Factor 4
ln(V)              .96       .22      -.07       .17
P                 -.12       .98       .06      -.05
T                  .14       .02       .99       .08
W                  .36      -.07       .08       .95
                  1.04      1.03       .99       .94


Table  4. 


ORTHOGONAL  (ORT)  AND  OBLIQUE  (OB)  ROTATION  (STUTTGART, 
JANUARY  1947-1953)  UNIT  0.01 


1 

CEIL 

2 

CL. AMT 

3 

Ln  VIS 

4 

WD 

5 

WS 

6 

TEMP 

7 

DEWP

8 

REHU 

9 

PRES 

VAR 

PROD.  VAR 


ORT  OB 


79  81 

62  66 
73  76 


73  69 


PRINCIPAL  COMPONENTS 
I  ORT  OB  I  ORT 


ORT  OB 


1  CEIL 

85  87 

2  CL. AMT 

91  92 

3  Ln  VIS 

76  78 

4  WD

52  55 

5  WS 

62  64 

6  TEMP 

86  88 

7  DEWP 

100  100 

8  REHU 

58  56 

9  PRES 

99  98 

171  173 


172  171 


204  202 

Table  5.


ORTHOGONAL  AND  OBLIQUE  ROTATION  (STUTTGART 
JULY  1947-1953)  UNIT  0.01 


1  CEIL 

2  CL. AMT 

3  Ln  VIS 

4  WD

5  WS 

6  TEMP 

7  DEWP 

8  REHU 

9  PRES 


1 

CEIL 

2 

CL. AMT 

3 

Ln  VIS 

4 

WD 

5 

WS 

6 

TEMP 

7 

DEWP 

8 

REHU 

9 

PRES 

PROD.  VAR 


ORT 

OB 

ORT 

OB 

ORT 

OB 

80 

82 

77 

80 

69 

69 

63 

61 

41 

68 

64 

69 

68 

68 

75 

93 

92 

87 

86 

208 

202 

213 

215 

143 

147 

UNWEIGHTED  LEAST  SQUARES 

ORT 

OB 

1  ORT 

OB 

!  ORT 

OB 

74  7 

91  9 


80  83 

98  98 


187  185 


Table 6. Number of Data Points Outside Tolerance Band (Stuttgart, 1946-1953)

                       UNROTATED               ORTH. ROTATION
SAMPLE DATA      PC   ULSQ  GLSQ    ML      PC   ULSQ  GLSQ    ML
JANUARY          20    25    27     12       4     7     6      8
APRIL            27    34    14      2       8     5     5      1
JULY             21    25     6      7       5     2     3      3
OCTOBER          21    21     6      9       2     1     6      2
SUMMER 00h       13    26     4      8       9     5     3      2
SUMMER 06h       21    20     5      6       7     3     3      2
SUMMER 12h       13    17     5     12       3     8     4     10
SUMMER 18h       16    23    13      6       5     6    13      6
WINTER 00h       23    24    24     14       9     6     3      6
WINTER 06h       22    17    22     13       8     9     6      8
WINTER 12h       24    22    28     10      13    11    14     10
WINTER 18h       22    25    14     10      11     5     5      6
Σ               243   279   168    109      84    66    71     64
Mean           20.2  23.2  16.0    9.1     7.0   5.5   5.9    5.3

PC = Principal Components Analysis, ULSQ = Unweighted Least Squares Estimators,
GLSQ = Generalized Least Squares Estimator, ML = Maximum Likelihood Estimators

Table 7.

STUTTGART, JULY (1946-1953)
TRANSFORMATION MATRICES
(PRINC. COMP. METHOD)

          ORTHOGONAL T1                      OBLIQUE T2
  0.67   -0.68   -0.29    0.05       0.65   -0.74   -0.34    0.08
  0.66    0.71   -0.17   -0.18       0.68    0.65   -0.19   -0.21
  0.34   -0.06    0.94    0.10       0.34   -0.10    0.89    0.11
  0.05    0.17   -0.11    0.98       0.04    0.15   -0.10    0.97


Table 8. FACTORS AS PREDICTORS

                                          UNROTATED        OBLIQUE ROT.
              Mean          σ            ε²       ε            ε²
CEILING       9.2×10³ ft    8.2×10³      5.38     2.31         5.38
CLOUD AMT     0.625         .44          0.009    0.10         .009
Ln VISIBIL.   0.8           0.95         .085     0.29         .086
WIND DIR      276°          90°         24.33     4.93        24.33
WIND SP       4.0 kt        3.1 kt       2.22     1.50         2.22
TEMP          24.0°F        0.6°F        6.29     2.51         6.29
DEWP          21.0°F        9.5°F        2.01     1.41         2.01
REL. HUM.     89.0%         10.3%       30.48     5.52        30.48
PRESSURE      1021.0 mb     8.1 mb       1.15     1.07         1.15
                                        50.64                 50.64

ε² = Σ(x-xp)²/N

UNITS IN LAST TWO COLUMNS ARE THE SAME AS IN THE FIRST COLUMN.

An Exact Method for One-Sided Tolerance Limits
in the Presence of Batch-to-Batch Variation

Mark Vangel

U. S. Army Materials Technology Laboratory
Watertown, Massachusetts 02172-0001

Abstract 

Mee and Owen (1983) proposed an improvement on a method of Lemon (1977)
for estimating tolerance limits from a balanced one-way ANOVA random effects
model. This method uses an approximation of Satterthwaite (1946) to replace a
linear combination of two chi-square random variables with a random variable
having a chi-square distribution. The tolerance factor is then estimated as a
quantile of a noncentral t-distribution. The Mee-Owen procedure is
conservative for all values of the population variance ratio.

An alternative approach is to view the tolerance limit problem as a
variant of the Behrens-Fisher problem. The work of Welch (1947) and Trickett
and Welch (1954) may then be applied to derive an integral equation the
solution of which, a function of the ratio of the between batch to the within
batch mean squares, provides an exact solution to the problem.

An algorithm is presented for iteratively approximating this function.
Neither the existence of a solution nor the convergence of this algorithm is
discussed, but numerical evidence is presented which suggests that the
proposed solution is, for the purposes of applied statistics, exact for all
values of the ratio of between batch to within batch population variances.

Two other topics considered in this paper are an approximation to the
tolerance limit based on the Welch-Aspin series solution to the Behrens-Fisher
problem and a discussion of the effect pooling and using a single sample
procedure has on the coverage probability of the tolerance limit.

An application to determining (.90, .95) lower tolerance limits for
composite material strength data in the presence of batch-to-batch variation
is discussed. This tolerance limit is referred to as the 'B-basis material
property' by aircraft designers and is used to determine the acceptability of
a composite material for aircraft applications.


1.  Introduction 


If a material is manufactured in many large batches and the population of
interest consists of all batches, the random effects model may be an
appropriate model for measurements made on characteristics of the material.

Let $X_{ij}$ denote the jth of J observations from the ith of I batches. If
$X_{ij}$ follows a one-way balanced random-effects model, then

(1.1)  $X_{ij} = \mu + b_i + e_{ij}$

where $\mu$ denotes the population mean, $\mu_i = \mu + b_i$ denotes the mean
of the ith batch, and $e_{ij}$ is the error term. The $b_i$'s and the
$e_{ij}$'s are assumed to be independently distributed normal with mean zero
and variance $\sigma_b^2$ and $\sigma_e^2$ respectively. An observation X from
this population is thus normally distributed with mean $\mu$ and variance

(1.2)  $\sigma_x^2 = \sigma_b^2 + \sigma_e^2$.

This paper presents techniques for determining one-sided
tolerance limits for X based on a random sample of J items from each of I
batches. A ($\beta$, $\gamma$) lower tolerance limit is a random variable T
such that a proportion $\beta$ of the population is covered by the interval
(T, ∞) with probability $\gamma$. The methods developed here for lower
tolerance limits may be adapted in an obvious way to upper limits.

An important industrial application of tolerance limits is to the
characterization and certification of structural materials for aircraft. In
order to determine the acceptability of material for aircraft applications,
designers use 'material basis properties' which are tolerance limits on the
strength of a material as determined from experimental failure data. A
(.90, .95) lower tolerance limit is called a 'B-basis' value or
'B-allowable'. The more stringent (.99, .95) limit is referred to as an
'A-basis' value or 'A-allowable'.

There is increasing interest in the use of composite materials as
lightweight alternatives to metals for aircraft applications. Composite
material properties typically exhibit far more batch-to-batch variability
than do metals; consequently there is a growing need for methods for
determining one-sided tolerance limits in the presence of batch-to-batch
variation.

A modification of a procedure of Lemon (1977) and Mee and Owen (1983) has
therefore been adopted for this application by Neal et al. (1987) and will be
included in MIL-Handbook-17 (1987), a handbook for the use of composites in
aerospace applications. It is hoped that the virtually exact method to be
discussed in Section 6 will eventually supersede the Mee-Owen procedure for
this application.

2. The Mee-Owen Procedure

Let n = IJ denote the sample size. The parameters $\mu$, $\sigma_b^2$ and
$\sigma_e^2$ of the random effects model may be estimated by the pooled mean
$\hat\mu$, the within batch mean square $MS_e$ and a linear combination of
$MS_e$ with the between batch mean square $MS_b$ where:

(2.1)  $\hat\mu = \sum_{i=1}^{I}\sum_{j=1}^{J} X_{ij}/n$,

(2.2)  $MS_b = J\sum_{i=1}^{I} (\hat\mu_i - \hat\mu)^2/(I-1)$,
       $\hat\mu_i = \sum_{j=1}^{J} X_{ij}/J$,

(2.3)  $MS_e = \sum_{i=1}^{I}\sum_{j=1}^{J} (X_{ij} - \hat\mu_i)^2/(n-I)$.

An unbiased estimator of the population variance is

(2.4)  $\hat\sigma_x^2 = MS_b/J + (1 - 1/J)\,MS_e$.

For $0 < \beta < 1$, let $K_\beta$ be the $\beta$ quantile of the standard
normal distribution, i.e.

(2.5)  $\beta = (1/\sqrt{2\pi}) \int_{-\infty}^{K_\beta} e^{-t^2/2}\,dt$.

A ($\beta$, $\gamma$) lower tolerance limit is a 100$\gamma$% lower
confidence bound on

(2.6)  $\mu - K_\beta\sigma_x$.


By analogy with the single sample case (see, for example, Owen (1968)),
one seeks an estimator of the form

(2.7)  $\hat\mu - k\hat\sigma_x$

where k is chosen to satisfy

(2.8)  $P(\hat\mu - k\hat\sigma_x \le \mu - K_\beta\sigma_x) = \gamma$.

Since $\hat\mu$ is distributed normal with mean $\mu$ and variance

(2.9)  $\sigma_{\hat\mu}^2 = (J\sigma_b^2 + \sigma_e^2)/n$,

one may rewrite (2.8) as

(2.10)  $P((Z + \sqrt{n}K_\beta B)/(\hat\sigma_x/\sigma_x) \le \sqrt{n}kB) = \gamma$

where

(2.11)  $Z = (\hat\mu - \mu)/\sigma_{\hat\mu}$,

(2.12)  $B = ((JR + 1)/(R + 1))^{-1/2}$,

(2.13)  $R = \sigma_b^2/\sigma_e^2$.

The random variable $\hat\sigma_x^2/\sigma_x^2$ is approximately distributed
as the ratio of a chi-square to its degrees of freedom, where the degrees of
freedom are given by (Satterthwaite 1946):

(2.14)  $f = (R + 1)^2 / [(R + 1/J)^2/(I - 1) + (1 - 1/J)/n]$.

If $T_f^{-1}(\gamma, \delta)$ denotes the inverse of the noncentral
t-distribution with f degrees of freedom and noncentrality parameter
$\delta$, then

(2.15)  $k = T_f^{-1}(\gamma, \sqrt{n}BK_\beta)/(\sqrt{n}B)$.

Unfortunately, the tolerance limit factor k depends on the nuisance parameter
R. Mee and Owen suggest replacing R with

(2.16)  $\hat R_\eta = ((MS_b/MS_e)F_\eta - 1)/J$

where $F_\eta$ is the 100$\eta$ percentile of an F random variable with
degrees of freedom I(J-1) and I-1. $\hat R_\eta$ is a 100$\eta$% upper
confidence bound estimate on R (Searle, 1971, p. 414) and the confidence
coefficient may be determined by numerical integration so that

(2.17)  $P(\hat\mu - k(\hat R_\eta)\hat\sigma_x \le \mu - K_\beta\sigma_x) \ge \gamma$

for all I, J and R. These values are reproduced from Mee and Owen (1983) for
various combinations of $\beta$ and $\gamma$ in Table 1.

For the case of $\beta$ = .90 and $\gamma$ = .95 some of the conservatism
inherent in the above values has been removed by allowing $\eta$ to vary with
I and J. The result of this numerical work is presented in Table 2.
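
The estimators (2.1)-(2.4) and the Satterthwaite degrees of freedom (2.14)
are simple to compute. The sketch below is an illustration only: the batch
data are invented, the function names are hypothetical, and the final
factor k of (2.15) is omitted because it needs a noncentral t quantile,
which the Python standard library does not provide.

```python
import math


def anova_estimates(batches):
    """Return (grand mean, MS_b, MS_e, variance estimate) for balanced data."""
    I, J = len(batches), len(batches[0])
    batch_means = [sum(b) / J for b in batches]
    grand_mean = sum(batch_means) / I                                  # (2.1)
    ms_b = J * sum((m - grand_mean) ** 2 for m in batch_means) / (I - 1)  # (2.2)
    ms_e = sum((x - m) ** 2
               for b, m in zip(batches, batch_means)
               for x in b) / (I * (J - 1))                             # (2.3)
    var_x = ms_b / J + (1 - 1 / J) * ms_e                              # (2.4)
    return grand_mean, ms_b, ms_e, var_x


def satterthwaite_df(R, I, J):
    """Approximate degrees of freedom (2.14) for a given variance ratio R."""
    n = I * J
    return (R + 1) ** 2 / ((R + 1 / J) ** 2 / (I - 1) + (1 - 1 / J) / n)


def b_factor(R, J):
    """The constant B of (2.12), i.e. sigma_x / (sqrt(n) * sigma_mu_hat)."""
    return math.sqrt((R + 1) / (J * R + 1))


# Invented strength data: I = 3 batches of J = 3 specimens each.
data = [[10.0, 12.0, 11.0], [14.0, 15.0, 13.0], [9.0, 8.0, 10.0]]
grand_mean, ms_b, ms_e, var_x = anova_estimates(data)
print(grand_mean, ms_b, ms_e, var_x)
print(satterthwaite_df(1.0, 3, 3), b_factor(1.0, 3))
```

As R grows, `satterthwaite_df` approaches I - 1, reflecting that the
between batch mean square then dominates the variance estimate.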

3. An Exact Solution for Known R

If R is known, the tolerance limit factor k is the appropriate quantile of
the distribution of

(3.1)  $A = (Z + \delta)/(C_1Y_1 + C_2Y_2)^{1/2}$

where Z has a standard normal distribution; $Y_i$ is distributed as a
chi-square with $n_i$ degrees of freedom for i = 1, 2; and $C_1$, $C_2$ and
$\delta$ are constants with $C_1$ and $C_2$ positive. Once this distribution
has been determined the tolerance limit may be obtained exactly.

The density of the linear combination

(3.2)  $Y = C_1Y_1 + C_2Y_2$

is shown in Fleiss (1971) to be

(3.3)  $f_Y(y) = \frac{\Gamma((n_1+n_2)/2)}{\Gamma(n_1/2)\Gamma(n_2/2)}
        \int_0^1 x^{n_1/2-1}(1-x)^{n_2/2-1}\,
        \frac{\chi^2_{n_1+n_2}(y/(C_1x + C_2(1-x)))}{C_1x + C_2(1-x)}\,dx$

where $\chi^2_f(\cdot)$ is the chi-square density with f degrees of freedom.
By conditioning on the denominator of (3.1) one sees that

(3.4)  $F(k) \equiv P(A \le k)$

       $= \frac{\Gamma((n_1+n_2)/2)}{\Gamma(n_1/2)\Gamma(n_2/2)}
        \int_0^1 x^{n_1/2-1}(1-x)^{n_2/2-1}
        \int_0^\infty \Phi(kt - \delta)\,
        \chi^2_{n_1+n_2}(t^2/(C_1x + C_2(1-x)))\,
        \frac{2t}{C_1x + C_2(1-x)}\,dt\,dx$

       $= \frac{\Gamma((n_1+n_2)/2)}{\Gamma(n_1/2)\Gamma(n_2/2)}
        \int_0^1 x^{n_1/2-1}(1-x)^{n_2/2-1}\,
        T_{n_1+n_2}(k((n_1+n_2)(C_1x + C_2(1-x)))^{1/2}, \delta)\,dx$

where $\Phi(\cdot)$ is the standard normal distribution and $T_f(t, \delta)$
denotes the noncentral t cumulative with f degrees of freedom and
noncentrality parameter $\delta$, i.e.

(3.5)  $T_f(t, \delta) = \frac{\sqrt{2\pi}}{\Gamma(f/2)\,2^{f/2-1}}
        \int_0^\infty u^{f-1}\phi(u)\,\Phi(tu/\sqrt{f} - \delta)\,du$

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the standard normal density and
cumulative respectively.

For the tolerance limit problem, let

(3.6)  $C_1 = I/(I - 1)$,  $C_2 = 1/(JR + 1)$,
       $\delta = K_\beta(n(R + 1)/(JR + 1))^{1/2}$

and

(3.7)  $n_1 = I - 1$,  $n_2 = I(J - 1)$

where I, J, and R are as in Sections 1 and 2. The value k(R) such that
F(k) = $\gamma$ then provides an exact solution to the problem.

Although the above derivation is simple, it is apparently not well known.
A much more complicated representation of the distribution of the random
variable (3.1) is developed in Ray and Pitman (1961).
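
The conditioning argument behind (3.4) also suggests a quick numerical
check. The sketch below is not the paper's algorithm: it replaces the
numerical integration of (3.4) by a Monte Carlo average of
Phi(k sqrt(Y) - delta) over simulated values of Y, and then solves
F(k) = gamma by bisection. The function names, the number of draws, the
seed, and the example values I = 5, J = 5, R = 1 are all assumptions.

```python
import math
import random
from statistics import NormalDist


def root_denominators(I, J, R, draws=4000, seed=1):
    """Simulate sqrt(C1*Y1 + C2*Y2) with the constants of (3.6) and (3.7)."""
    c1, c2 = I / (I - 1), 1.0 / (J * R + 1)
    n1, n2 = I - 1, I * (J - 1)
    rng = random.Random(seed)
    # A chi-square variable with m d.f. is Gamma(shape=m/2, scale=2).
    return [math.sqrt(c1 * rng.gammavariate(n1 / 2, 2.0)
                      + c2 * rng.gammavariate(n2 / 2, 2.0))
            for _ in range(draws)]


def coverage_cdf(k, roots, delta):
    """Monte Carlo estimate of F(k) = P(A <= k) = E[Phi(k*sqrt(Y) - delta)]."""
    phi = NormalDist().cdf
    return sum(phi(k * t - delta) for t in roots) / len(roots)


def solve_k(I, J, R, beta=0.90, gamma=0.95):
    """Bisection for the k with F(k) = gamma, for a known variance ratio R."""
    n = I * J
    delta = NormalDist().inv_cdf(beta) * math.sqrt(n * (R + 1) / (J * R + 1))
    roots = root_denominators(I, J, R)
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if coverage_cdf(mid, roots, delta) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), roots, delta


k, roots, delta = solve_k(I=5, J=5, R=1.0)
print(k)
```

The same fixed set of simulated values is reused at every bisection step,
so the estimated F(k) is a smooth increasing function of k and the
bisection converges; the result carries Monte Carlo error and is only a
rough stand-in for the quadrature described above.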

4. The Effect of Pooling on the Coverage Probability

The tolerance limit procedure discussed in Section 2 is conservative (i.e.
provides a coverage probability greater than the nominal value) when the
population variance ratio, R, is small. Mee and Owen therefore suggest that
data be pooled and a single sample method be applied when the mean square
ratio is less than 1. They then proceed to investigate the conditional
behavior of their proposed estimator.

Using the distribution developed in Section 3, one may determine the
coverage probability for a single sample procedure applied to pooled data as a
function of the variance ratio. This result will be used to determine the
unconditional coverage probability of the Mee-Owen method in Section 7.

Let $Y_1$ and $Y_2$ be as in (3.2) and let $n_1$ and $n_2$ denote the between
and within batch degrees of freedom respectively (see 3.7). The pooled
variance estimate is

(4.1)  $S^2 = \sum_{i=1}^{I}\sum_{j=1}^{J} (X_{ij} - \hat\mu)^2/(n - 1)$

where n denotes the pooled sample size, I the number of batches, J the batch
size and $\hat\mu$ the grand mean. Partitioning the total mean square and
substituting (2.9) for the variance of $\hat\mu$, one obtains

(4.2)  $(S/\sigma_{\hat\mu})^2
        = \frac{\sigma_e^2Y_2 + (J\sigma_b^2 + \sigma_e^2)Y_1}
               {J\sigma_b^2 + \sigma_e^2}\cdot\frac{n}{n-1}
        = (Y_1 + Y_2/(JR + 1))\cdot\frac{n}{n-1}$

where R is the variance ratio (2.13).

If $k_0$ denotes the single sample tolerance limit factor (e.g. Owen, 1968,
pp. 446-448), then the coverage probability as a function of R is

(4.3)  $\gamma(R) = P(\hat\mu - k_0S \le \mu - K_\beta\sigma_x)$
                 $= P((Z + K_\beta\sigma_x/\sigma_{\hat\mu})/(S/\sigma_{\hat\mu}) \le k_0)$

with notation as in Section 2. Substituting (4.2) into (4.3) and employing
the distribution (3.4), one may readily examine $\gamma(R)$ numerically. From
the typical plot in Figure 1 it is apparent that the coverage probability
obtained may be substantially less than the nominal value even for small
values of R. Clearly, criteria which result in the decision to pool must be
considered carefully if one is to be assured of a reasonable tolerance limit
estimate in the presence of batch-to-batch variation. Alternatively, one
might seek an estimator which performs well for all R, eliminating the need
to pool altogether. This approach will be taken in Section 6.

5. The Solution for Unknown R: Welch-Aspin Series

For unknown variance ratio, the tolerance limit problem is very closely
related to the Behrens-Fisher problem. Following the work of Welch (1947) and
Trickett and Welch (1954), two forms for a solution are obtained.

A series solution is developed first. While computationally simple, the
first order approximation presented here is anticonservative and may only be
suitable for many batches.

Alternatively, the tolerance limit factor as a function of the mean square
ratio may be obtained approximately as the solution to an integral equation.
Although this requires the use of a computer, the method which results appears
to give the desired coverage probability, even for small sample sizes.


To simplify the notation in what follows, let $S_i^2$ be the mean squares
and $\sigma_i^2$ their expected values for i = 1, 2, i.e.:

(5.1)  $S_1^2 = MS_b$,   $\sigma_1^2 = J\sigma_b^2 + \sigma_e^2$,
       $S_2^2 = MS_e$,   $\sigma_2^2 = \sigma_e^2$.

Given the two mean squares, the tolerance limit factor may be expressed in
terms of the standard normal distribution:

(5.2)  $\gamma = P(\hat\mu - k\hat\sigma_x \le \mu - K_\beta\sigma_x)$
              $= E_{S_1^2,S_2^2}(\Phi(k\hat\sigma_x/(\sigma_1/\sqrt{n}) - \delta))$
              $= E_{S_1^2,S_2^2}(\Phi(h(S_1^2, S_2^2)/(\sigma_1/\sqrt{n}) - \delta))$.

The problem is to determine a function $h(S_1^2, S_2^2) = k\hat\sigma_x$ so
that (5.2) is approximately satisfied for all $\sigma_1^2$ and $\sigma_2^2$.
If tolerance limits on the median are desired, then $\delta = 0$ and the
results of Welch (1947) and Aspin (1948) may be used directly. If $\delta$ is
not zero, the idea behind the Welch-Aspin derivation may still be applied,
though the algebra is considerably messier.

Following Welch (1947), one begins by expanding the normal cumulative
about $(\sigma_1^2, \sigma_2^2)$ and recognizing that the expectation is the moment generating
function of the product of two independent chi-squares - with differential
operators as the independent variables in the generating functions. If one
defines


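(5.3)  $\partial_i \equiv \partial / \partial S_i^2,$

then

(5.4)  $\Phi\left(h(S_1^2, S_2^2)/(\sigma_1/\sqrt{n}) - \delta\right) = \prod_{i=1}^{2} e^{(S_i^2 - \sigma_i^2)\partial_i}\, \Phi\left(h(S_1^2, S_2^2)/(\sigma_1/\sqrt{n}) - \delta\right)\Big|_{S_i^2 = \sigma_i^2}.$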


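Substituting (5.4) into (5.2) gives

(5.5)  $\gamma = \int\!\!\int \prod_{i=1}^{2} e^{(S_i^2 - \sigma_i^2)\partial_i}\, \Phi\left(h(S_1^2, S_2^2)/(\sigma_1/\sqrt{n}) - \delta\right) \chi_1(S_1^2)\,dS_1^2\; \chi_2(S_2^2)\,dS_2^2$

where

(5.6)  $\chi_i(S_i^2)\,dS_i^2 = \frac{1}{\Gamma(n_i/2)\,2^{n_i/2}} \left(n_i S_i^2/\sigma_i^2\right)^{n_i/2 - 1} e^{-n_i S_i^2/(2\sigma_i^2)}\, d(n_i S_i^2/\sigma_i^2)$

are the densities of the mean squares $S_i^2$ and the $n_i$ are their respective
degrees of freedom.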

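In terms of the operator

(5.7)  $\Theta \equiv \prod_{i=1}^{2} (1 - 2\sigma_i^2\partial_i/n_i)^{-n_i/2}\, e^{-\sigma_i^2\partial_i} = 1 + \sum_{i=1}^{2} \sigma_i^4\partial_i^2/n_i + \cdots$

the tolerance limit problem can be stated as

(5.8)  $\Theta\,\Phi\left(h(S_1^2, S_2^2)/(\sigma_1/\sqrt{n}) - \delta\right) = \gamma.$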

The  next  step  is  to  expand  the  normal  cumulative  about  (see  (2.5))  in 
a  second  Taylor  series,  giving 

(5.9)  ♦(h(S?  ,  S|  )  /(ojVn)  -  6) 


Si)  /(oj/Zn)  -  6 


(5.10)  Dr*(v)  5  *(v)  . 

y 


Express $h(S_1^2, S_2^2)$ as a series in increasing inverse powers of the degrees
of freedom

(5.11)  $h(S_1^2, S_2^2) = h_0(S_1^2, S_2^2) + h_1(S_1^2, S_2^2) + \cdots$

where $h_j(S_1^2, S_2^2)$ consists of terms of order $j$ in $1/n_i$ for $i = 1, 2$. One can
now, in principle, solve successively for the $h_j$'s. If terms of order greater
than zero are considered negligible, then

(5.12)  $h_0(S_1^2, S_2^2)/(\sigma_1/\sqrt{n}) - \delta = K_\gamma$

which leads to a zeroth order approximation to $k$:

(5.13)  $k_0 = h_0(S_1^2, S_2^2)/\hat{\sigma}_x$
        $= K_p + K_\gamma / \left(I(1 + (J - 1)S_2^2/S_1^2)\right)^{1/2}.$
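As a rough numerical check of the zeroth order factor, the sketch below evaluates (5.13) as read here, $k_0 = K_p + K_\gamma/(I(1 + (J-1)/MSR))^{1/2}$, for the mean squares quoted in the Example section (between-batch 89.88, within-batch 158.6, five batches of five). The function name `k0` and the use of Python's `statistics.NormalDist` for the normal quantiles are illustrative assumptions and are not part of the paper.

```python
from math import sqrt
from statistics import NormalDist


def k0(ms_b, ms_e, I, J, p=0.90, gamma=0.95):
    """Zeroth order tolerance limit factor, following the reading of (5.13).

    ms_b, ms_e : between- and within-batch mean squares (S_1^2, S_2^2)
    I, J       : number of batches and batch size (balanced data assumed)
    p, gamma   : content and confidence of the tolerance limit
    """
    K_p = NormalDist().inv_cdf(p)          # standard normal quantile K_p
    K_gamma = NormalDist().inv_cdf(gamma)  # standard normal quantile K_gamma
    msr = ms_b / ms_e                      # mean square ratio S_1^2 / S_2^2
    return K_p + K_gamma / sqrt(I * (1.0 + (J - 1) / msr))


value = k0(89.88, 158.6, I=5, J=5)
print(round(value, 2))  # about 1.54
```

The result, about 1.54, lies below the factors of 1.78 and 1.83 reported in the example for the series and integral equation solutions, which is in line with the remark that the low order approximation is anticonservative for few batches.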


The next term, $h_1(S_1^2, S_2^2)$, can be shown to be the solution to


(5.14) 


MSi/o:  *  1)D 
y  •  (14  |oja*/nt)  e  ' 


K  ((cx-  Oy)  /(o1//n))D 


h^Sj,  S|)  /(Oi//n)D 


♦(v). 


After  some  algebra,  the  first  correction  to  k0  is  seen  to  be 


(5.15)  kj  -  e/(4/I)  (K  (Kj  4  1)  /nx 


4  2K„K*  /I  /n ! e  4  K*K  I  /n,6J  4  KVl  /n.e1 

3  1  3  y  1  3  1 


4Ki  K  X(J-l)J/(n,MSRJ)82  4  Kfl( J-l) Vi  /(n:MSR2 )0* ) 
Pi  P 




where

(5.16)  $\theta \equiv \left(1/(1 + (J - 1)/MSR)\right)^{1/2}$

and

(5.17)  $MSR \equiv S_1^2/S_2^2.$

The coverage probability for the above approximation as a function of the
population variance ratio is plotted in Figure 2 for a (.90, .95) tolerance
limit and J = 5. Note that for many batches the series solution performs
well, though for few batches it is anticonservative.

6.  An  Alternative  Solution  for  Unknown  R 

For small samples, the first order approximation developed above may not
be adequate. An alternative approach is to view the problem as an integral
equation, following Trickett and Welch (1954). If one defines

(6.1)  $\tau \equiv 1/(JR + 1)$

then (3.4) may be written as

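(6.2)  $\gamma = \frac{\Gamma((n_1 + n_2)/2)}{\Gamma(n_1/2)\,\Gamma(n_2/2)} \int_0^1 x^{n_1/2 - 1}(1 - x)^{n_2/2 - 1}\, T_{n_1+n_2}\!\left(k(\tau)\left((n_1 + n_2)\left(\tfrac{I}{I-1}x + \tau(1 - x)\right)\right)^{1/2}, \delta\right) dx$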

where

(6.3)  $\delta = \sqrt{n}\,K_p B = K_p\left(I(1 + (J - 1)\tau)\right)^{1/2}$

and B is as defined in (2.12). The parameter $\tau$ may be estimated by the
reciprocal of the mean square ratio (4.17):

(6.4)  $u \equiv 1/MSR = \tau F_{n_2, n_1}$

where $F_{n_2, n_1}$ denotes a random variable with an F distribution with $n_2$ and $n_1$
degrees of freedom. The tolerance limit problem reduces to determining a
function k(u) such that

(6.5)  $\gamma = P\left((Z + \delta(\tau))/\left(\tfrac{I}{I-1}Y_1 + \tau Y_2\right)^{1/2} \le k(u)\right)$
       $= P\left(Z \le k\left(\tau(n_1/n_2)(Y_2/Y_1)\right)\left(\tfrac{I}{I-1}Y_1 + \tau Y_2\right)^{1/2} - \delta(\tau)\right)$

where Z, $Y_1$ and $Y_2$ are as in Section 3. This is equivalent to the integral
equation

(6.6)  $\gamma = \frac{\Gamma((n_1 + n_2)/2)}{\Gamma(n_1/2)\,\Gamma(n_2/2)} \int_0^1 x^{n_1/2 - 1}(1 - x)^{n_2/2 - 1}\, T_{n_1+n_2}\!\left(k(c)\left((n_1 + n_2)\left(\tfrac{I}{I-1}x + \tau(1 - x)\right)\right)^{1/2}, \delta(\tau)\right) dx$

where

(6.7)  $c \equiv \tau\, n_1(1 - x)/(n_2 x).$
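The probability in (6.5) can also be estimated by simulation, which gives a quick way to see how a candidate factor k(u) behaves for a given $\tau$. The sketch below assumes that Z is standard normal and that $Y_1$ and $Y_2$ are independent chi-squares with $n_1 = I - 1$ and $n_2 = I(J - 1)$ degrees of freedom (Section 3 is not reproduced here, so this is an assumption), and it uses the zeroth order factor of Section 5 as k(u). All function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def k0_of_u(u, I, J, K_p, K_gamma):
    """Zeroth order factor written as a function of u = 1/MSR."""
    return K_p + K_gamma / sqrt(I * (1.0 + (J - 1) * u))


def coverage(tau, I, J, p=0.90, gamma=0.95, shift=0.0, reps=20000, seed=1):
    """Monte Carlo estimate of the probability in (6.5) for k(u) = k0(u) + shift."""
    rng = random.Random(seed)
    K_p = NormalDist().inv_cdf(p)
    K_gamma = NormalDist().inv_cdf(gamma)
    n1, n2 = I - 1, I * (J - 1)                    # assumed degrees of freedom
    delta = K_p * sqrt(I * (1.0 + (J - 1) * tau))  # noncentrality, as in (6.3)
    hits = 0
    for _ in range(reps):
        z = rng.gauss(0.0, 1.0)
        y1 = rng.gammavariate(n1 / 2.0, 2.0)       # chi-square with n1 d.f.
        y2 = rng.gammavariate(n2 / 2.0, 2.0)       # chi-square with n2 d.f.
        u = tau * (n1 / n2) * (y2 / y1)            # estimate of tau, as in (6.4)
        k = k0_of_u(u, I, J, K_p, K_gamma) + shift
        scale = sqrt(I / (I - 1) * y1 + tau * y2)
        if z <= k * scale - delta:
            hits += 1
    return hits / reps


print(coverage(tau=0.5, I=5, J=5))             # estimated coverage of k0
print(coverage(tau=0.5, I=5, J=5, shift=0.3))  # a larger factor covers at least as often
```

Because both calls reuse the same seed, they see the same random draws, so the second estimate can never fall below the first. A simulation of this kind is only a check on a given k(u); it does not by itself solve the integral equation.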

Using the results of Section 5, one may define

(6.8)  $k(c) = k_0(c) + \epsilon k_1(c)$

where $k_0(c)$ is the first order approximation from the Welch-Aspin procedure
and $k_1(c)$ is an unknown function. If an approximation to $k_1(c)$ can be
obtained, this approximation may lead to an improved $k_0(c)$.

Letting $V(\cdot)$ represent the functional (6.6), if one expands $V(\cdot)$ in a
Taylor series about $\epsilon = 0$ one obtains the first order approximation

(6.9)  $\gamma = V(k_0(c) + \epsilon k_1(c)) \approx V(k_0(c)) + \epsilon \left.\frac{\partial V}{\partial \epsilon}\right|_{\epsilon = 0}.$

Since $\epsilon$ is arbitrary, it may be taken to equal one. The approximation may
then be written as

(6.10)  $\gamma = V(k_0) + \frac{\Gamma((n_1 + n_2)/2)}{\Gamma(n_1/2)\,\Gamma(n_2/2)} \int_0^1 x^{n_1/2 - 1}(1 - x)^{n_2/2 - 1}\, k_1(c)\left((n_1 + n_2)\left(\tfrac{I}{I-1}x + \tau(1 - x)\right)\right)^{1/2}$
        $\times\; t_{n_1+n_2}\!\left(k_0(c)\left((n_1 + n_2)\left(\tfrac{I}{I-1}x + \tau(1 - x)\right)\right)^{1/2}, \delta\right) dx$

where $t_f(\cdot, \delta)$ denotes the noncentral t density. The noncentral t density
with f degrees of freedom and noncentrality parameter $\delta$ may be calculated by
means of the following formula (Odeh and Owen, 1980, p. 272):

(6.11)  $t_f(x, \delta) = (f/x)\left(T_{f+2}\left(((f + 2)/f)^{1/2}x, \delta\right) - T_f(x, \delta)\right).$


The first term on the right-hand side of (6.10), $V(k_0)$, may be evaluated
numerically for given $\tau$ since $k_0(c)$ is a known function. The second integral
is concentrated about $x = n_1/(n_1 + n_2)$. If $k_1(c)$ is evaluated at this value, the
remainder of this integral may also be evaluated numerically. Note that

(6.12)  $k_1(c)\big|_{x = n_1/(n_1 + n_2)} = k_1(\tau)$

so that, with obvious notation for the two integrals to be evaluated
numerically,

(6.13)  $\gamma = V_0 + k_1(\tau)\,V_1$

i.e.,

(6.14)  $k_1(\tau) = (\gamma - V_0)/V_1.$

Since k(c) is the same function of c that $k(\tau)$ is of $\tau$, one may use a first
approximation $k_0(c)$ to get a new approximation $k_1(c)$ by evaluating (6.11) for
a mesh of $\tau$ values. This $k_1$ becomes the $k_0$ for the next iteration. Although
it is certainly not obvious that such a procedure will converge, or even that
a solution exists, it will be shown below that this algorithm appears to
provide a solution to the tolerance limit problem that is (for practical
purposes) exact.

7.  Discussion 

The situation of primary interest to the aircraft industry, (.90, .95)
lower tolerance limits, is the only case yet examined in detail. Four methods
have been presented in this paper: the Mee-Owen method (Section 2), a modified
Mee-Owen method (Section 2), a method based on the Welch-Aspin series (Section
5) and a method based on the solution of an integral equation (Section 6).
The coverage probability functions corresponding to these methods are numbered
1-4 in Figure 3 for five batches each of size five.

The integral equation solution is for most practical purposes an exact
solution to the problem. The Mee-Owen method has the disadvantage of being
substantially conservative when the variance ratio is small.

Only a modest reduction in this conservatism has resulted from the
modification of the confidence level of the variance ratio estimate (Section 2,
Table 2).

The Welch-Aspin series solution is clearly not suitable for as few as five
batches, as discussed in Section 5. However, it is easy to compute and
provides an adequate starting function for the iterative solution of the
integral equation (6.11).

From the rescaled plot of the coverage probability function for the
integral equation solution (Figure 4) it can be seen that for R > 1 the actual
coverage probability differs from .95 by no more than ± .00005. This small
difference can be attributed to roundoff error. For R < 1, however, the
difference in the actual and nominal coverage probability indicates that the
convergence is not uniform. The convergence of the successive approximations
to the tolerance limit factor needs to be more thoroughly examined, though the
practical gain from such an investigation may be slight.

8. Example


The data in Table 3 are a pseudo-random sample of 25 from a normal
distribution with mean 50 and standard deviation 10. These data have been
arbitrarily grouped into five batches of five. By fitting a one-way random
effects model to these data one obtains (2.1 - 2.4):

(8.1)  $\hat{\mu} = 48.30, \quad MS_b = 89.88, \quad MS_e = 158.6, \quad \hat{\sigma}_x^2 = 144.9.$

A (.90, .95) lower tolerance limit is of the form

(8.2)  $T = \hat{\mu} - K\hat{\sigma}_x.$

For the method of Mee and Owen (1983) K = 1.90. If the Mee-Owen method is
modified as suggested in Section 2, then K only decreases to 1.89. The
series solution of Section 5 gives K = 1.78 and the integral equation of
Section 6 results in K = 1.83. The tolerance limit estimates are,
respectively, 25.42, 25.54, 26.82 and 26.29. These values may be compared with
the tolerance limit estimate for the pooled data, which is 26.00.
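The arithmetic behind these estimates can be reproduced from (8.1) and (8.2). The sketch below assumes the usual balanced one-way random effects estimate of the population variance, (between-batch mean square + (J - 1) x within-batch mean square)/J, which is consistent with the value 144.9 quoted in (8.1); the dictionary of method labels is purely illustrative. Because the factors K are quoted to two decimals, the recomputed limits agree with the reported ones only to within a few hundredths.

```python
from math import sqrt

I, J = 5, 5                      # five batches of five
mu_hat = 48.30                   # estimated mean, from (8.1)
ms_b, ms_e = 89.88, 158.6        # between- and within-batch mean squares

# Assumed estimator of the population variance (between plus within components).
var_x = (ms_b + (J - 1) * ms_e) / J
sd_x = sqrt(var_x)

# Tolerance limit factors K as quoted in the text, paired with the reported limits.
methods = {
    "Mee-Owen": (1.90, 25.42),
    "modified Mee-Owen": (1.89, 25.54),
    "Welch-Aspin series": (1.78, 26.82),
    "integral equation": (1.83, 26.29),
}

limits = {}
for name, (K, reported) in methods.items():
    limits[name] = mu_hat - K * sd_x      # lower tolerance limit, form (8.2)
    print(f"{name:20s} T = {limits[name]:.2f} (reported {reported})")
```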

9.  Conclusion 

One-sided tolerance limits for random effects models is a topic of
considerable importance in engineering statistics. The purpose of this paper
has been to consider this tolerance limit problem from the point of view of the
Welch-Aspin interpretation of the Behrens-Fisher problem. This approach leads
to what will very likely prove to be a solution which, for the purposes of
applied statistics, is exact. Some numerical work remains to be done, leading
to the preparation of tables to be presented in a subsequent publication.

ABSTRACT

n balls are randomly distributed in N cells, so that no cell may contain
more than one ball. This process is repeated m times. In addition, balls
may disappear; such disappearances are independent and identically
Bernoulli distributed. Conditions are given under which the number of
empty cells has an asymptotically ($N \to \infty$) standard normal distribution.

1. INTRODUCTION


The distribution of the number of empty cells in the following
random allocation process is considered. Let n, N be positive integers
with $n \le N$. Assume that n balls are randomly distributed into
N cells, so that no cell may contain more than one ball. Then, the
probability that each of n specified cells will be occupied is $\binom{N}{n}^{-1}$.
This process is repeated m times, so that there are $\binom{N}{n}^{m}$ random
allocations of nm balls among the N cells. In addition, for each
ball, let p, $0 \le p \le 1$, be the probability that the ball will not
"disappear" from the cell. The "disappearances" are assumed to be
stochastically independent for each ball; thus the disappearances
constitute a sequence of nm Bernoulli trials.

Several special cases of this problem have previously been
considered. In particular, p = 1, n = 1 is the classical occupancy
problem, see [2], [3], [10]. The case p = 1, n arbitrary has been
discussed in [4] and [7]. The case 0 < p < 1, n = 1 is treated in
C. J. Park [6].

In this paper, we obtain the probability distribution and moments
of the number of empty cells. In section 3, we show that the number of
empty cells may be represented as a sum of independent Bernoulli random
variables. This representation permits us to determine conditions
on m, n, p, N such that the number of empty cells is asymptotically
normally distributed.

This random allocation process may be viewed as a filing or
storage process. Objects are randomly assigned to files or storage bins.
From time to time, objects may be missing or have disappeared.

Let m, n, N be positive integers with $n \le N$. m sets, each
consisting of n balls, are distributed into N cells at random so that
no cell can contain more than one ball from the same set. As each set
is distributed, the balls that have been placed during the preceding
distributions are left in the cells. Thus, at the end of the process,
cells may contain as many as m balls. In addition, each ball may
"disappear" with common probability 1 - p, $0 \le p \le 1$. These
disappearances are stochastically independent and thus constitute a sequence of
mn Bernoulli trials.

Let  n  r  »U)  be  the  probability  that  exactly  J  of  the  N 
cells  are  empty. 

We  now  establish  the  following  theorem. 
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Proof.  From  J.  Rlordan  [9]*  p.  53,  from  (4),  It  follows  Immediately 
that 


E($(v))  ■  (|J)vl(JJ)'m[ j  Ot^)(1-P)V  •  (?) 

We thus obtain the following.

Corollary.  $E(S) = N(1 - pn/N)^m,$  (8)

°!  *  N(N.))[i.N=8j^-’) .  ♦  (i-P)*t{Etyi* 

+  No  •^)*n-N(i-^pn  .  (9) 


Proof.  ’  From  (7) 

e(s)  -  N(Jj)"’((H;’).(J[;])n.p))B .  no  . 


Since

$\sigma_S^2 = E(S_{(2)}) + E(S) - (E(S))^2,$

the conclusion follows readily from (6), after some elementary
calculations.
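The mean in (8) is easy to check by simulation. The sketch below simulates the allocation process directly and compares the sample mean and variance of S with closed forms. The variance expression used in the code is derived here from an elementary two-cell argument (the probability that two given cells both retain no ball in one allocation is $1 - 2pn/N + p^2 n(n-1)/(N(N-1))$) and is not copied from (9); the function names and the parameter values are hypothetical.

```python
import random


def simulate_empty_cells(N, n, m, p, reps=20000, seed=0):
    """Simulate S, the number of empty cells; return its sample mean and variance."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(reps):
        occupied = [False] * N
        for _ in range(m):                        # m repeated allocations
            for cell in rng.sample(range(N), n):  # n distinct cells per allocation
                if rng.random() < p:              # the ball does not disappear
                    occupied[cell] = True
        s = occupied.count(False)
        total += s
        total_sq += s * s
    mean = total / reps
    var = total_sq / reps - mean * mean
    return mean, var


def exact_moments(N, n, m, p):
    """Mean and variance of S from one-cell and two-cell emptiness probabilities."""
    q1 = 1.0 - p * n / N   # a given cell keeps no ball in one allocation
    q2 = 1.0 - 2.0 * p * n / N + p * p * n * (n - 1) / (N * (N - 1))  # two given cells
    mean = N * q1 ** m
    var = N * (N - 1) * q2 ** m + mean - mean ** 2
    return mean, var


print(simulate_empty_cells(20, 5, 3, 0.7))
print(exact_moments(20, 5, 3, 0.7))
```

With these illustrative values the exact mean is $20(0.825)^3$, roughly 11.23, and the simulated mean should fall close to it.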


For  some  purposes,  the  following  equivalent  forms  of  (9)  will 


prove  useful. 


°!  •  n(n-i)[i  .  *  ho  -fni-No  .fn  m 


end 


■*>*£«  - 


1) 


+ 


N(i  -^Ml  -  (1  -^ni  - 


np*(N«n)  -m 
(N-l)(N-pn)|J  ' 


From  Theorem  2,  we  readily  obtain  the  following. 


01) 


Theorem  3.  The  factorial  moment  generating  function  of  S  Is  given  by 

♦.«*)  • E(ut)S  •  .  <u) 

Note that $\phi_m(t)$ is a polynomial in t of degree N. This fact is
exploited in the next section, where the asymptotic distribution of
S is obtained. In particular,

$\phi_0(t) = (1 + t)^N$  (13)

and


We now investigate the asymptotic distribution properties of the
number of empty cells.

3. THE ASYMPTOTIC DISTRIBUTION OF THE NUMBER OF EMPTY CELLS


In this section, we determine conditions under which the number of
empty cells (when suitably normalized) has an asymptotically normal
distribution. In order to establish this, a number of preliminary results are
required.

Lemma 1. Let N, n, r be non-negative integers, $r \le n \le N$. Then


Proof.  Since  (^)  ■  0  whenever  v  <  a  ,  we  can  write 


The  conclusion  follows  Immediately 


Lemma  2. 
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Proof.  The  right-hand  side  of  (16)  may  be  written 
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Thus,  the  coefficient  of  p^  Is 
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From  Lemma  1 , 


from which the conclusion follows immediately. Employing the above
lemmas, we can now establish the following theorem.

Theorem 3. The factorial moment generating function of the number
of empty cells $\phi_m(t)$ (12) satisfies the following differential-
difference equation,

The conclusion now follows from Lemma 2.

Then, from Theorem 3, we have that


$\phi_{m+1}(t) = T(\phi_m(t)), \quad \phi_0(t) = (1 + t)^N.$


Lemma 3. Extend the domain of T to the complex plane, letting
z = x + iy; x, y real. Let
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Proof.  Let 


C  ■  {z:|z  +  (c-lY)|  s  [(c-a)2  +  c-j(a+b)}. 


Clearly  -a  and  -b  are  on  the  boundary  of  the  circular  region  CY 
Consequently  all  zeros  of  i^(z)  are  in  ty  .  Let  z*  be  a  zero  of 
<Kz).  Let 
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Consequently,  C  u  Bp  Is  contained  In  the  Interval  (21),  proving 


the  lemma. 


We  now  establish  the  following  theorem. 

Theorem  4.  Let 

yt>  ■  jo  . 

Let $t_1^{(m)}, t_2^{(m)}, \ldots, t_N^{(m)}$ be the zeros (not necessarily distinct) of
$\phi_m(t)$. Then $t_j^{(m)}$, $j = 1, 2, \ldots, N$ are all real and

Proof. From (19).

The zeros of $\phi_0(t)$ are $t_1^{(0)} = \cdots = t_N^{(0)} = -1$. The zeros
of $\phi_1(t)$ are $t_1^{(1)} = -1, \ldots, t_{N-n}^{(1)} = -1$, $t_{N-n+1}^{(1)} = -1/(1-p), \ldots,$
$t_N^{(1)} = -1/(1-p)$. Now apply Lemma 3 with $\psi(z) = \phi_1(z)$ obtaining
a = 1, $b = (1-p)^{-1}$. Then, the zeros of $\phi_2(t)$ are real and satisfy


$-(1-p)^{-2} \le t_j^{(2)} \le -1, \quad j = 1, 2, \ldots, N.$

It then follows readily by induction that the zeros of $\phi_k(t)$ are
real and satisfy

$-(1-p)^{-k} \le t_j^{(k)} \le -1, \quad j = 1, 2, \ldots, N, \quad k = 2, 3, \ldots.$

Theorem 5. For $1 \le n \le N$, $0 \le p \le 1$, $m \ge 1$, S has a
representation as the sum of N mutually independent Bernoulli random
variables. That is, there exist mutually independent Bernoulli random
variables, $Y_j = Y_j(N, n, p, m)$, $j = 1, 2, \ldots, N$, such that
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and 
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(27) 
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Proof. 


Let Y be a Bernoulli random variable with $P\{Y = 1\} = \tau$.
Then the factorial moment generating function of Y is

$E_Y\{(1+t)^Y\} = (1 + \tau t).$

If

$W = \sum_{j=1}^{N} Y_j,$

where $Y_1, \ldots, Y_N$ are mutually independent Bernoulli random variables
with $P\{Y_j = 1\} = \tau_j$, then the factorial moment generating function of
W is

$\xi(t) = E_W\{(1+t)^W\} = \prod_{j=1}^{N} E_{Y_j}\{(1+t)^{Y_j}\} = \prod_{j=1}^{N} (1 + \tau_j t),$  (28)

where $0 \le \tau_j \le 1$, $j = 1, 2, \ldots, N$. From Theorem 4, the factorial
moment generating function of S may be written

$\phi_m(t) = (1-p)^{nm} \prod_{j=1}^{N} (t - t_j^{(m)}), \quad m = 0, 1, \ldots,$  (29)

where $t_j^{(m)}$ are real and $-(1-p)^{-m} \le t_j^{(m)} \le -1$, $j = 1, 2, \ldots, N$.

Since every polynomial of degree N with real roots has a unique
representation of the form


the representation follows by setting $\tau_j = -(t_j^{(m)})^{-1}$ and noting
that $\xi(0) = \phi_m(0) = 1$.
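The product form (28) can be checked numerically for any set of success probabilities. The sketch below builds the distribution of a sum of independent Bernoulli variables by convolution and compares $E\{(1+t)^W\}$ computed from that distribution with the product $\prod (1 + \tau_j t)$. The probabilities used are hypothetical and are not derived from the zeros $t_j^{(m)}$ of the paper.

```python
def bernoulli_sum_pmf(taus):
    """pmf of W = Y_1 + ... + Y_N for independent Bernoulli Y_j, P{Y_j = 1} = tau_j."""
    pmf = [1.0]
    for tau in taus:
        nxt = [0.0] * (len(pmf) + 1)
        for w, prob in enumerate(pmf):
            nxt[w] += prob * (1.0 - tau)   # Y_j = 0 leaves the sum unchanged
            nxt[w + 1] += prob * tau       # Y_j = 1 raises the sum by one
        pmf = nxt
    return pmf


def fmgf_from_pmf(pmf, t):
    """Factorial moment generating function E{(1+t)^W} computed from the pmf."""
    return sum(prob * (1.0 + t) ** w for w, prob in enumerate(pmf))


def fmgf_product(taus, t):
    """Product form of (28)."""
    out = 1.0
    for tau in taus:
        out *= 1.0 + tau * t
    return out


taus = [0.2, 0.5, 0.9]   # hypothetical success probabilities
t = 0.7
pmf = bernoulli_sum_pmf(taus)
print(fmgf_from_pmf(pmf, t), fmgf_product(taus, t))
```

The two printed values should agree up to floating point rounding, for any t and any probabilities in [0, 1].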

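Let $\kappa_\nu = \kappa_\nu(n, N, m, p)$ be the cumulants of S and let $\kappa_{[\nu]}$
be the factorial cumulants of S. That is,

$\log \phi_m(t) = \sum_{\nu=1}^{\infty} \kappa_{[\nu]}\, t^\nu/\nu!.$

Then

$\kappa_\nu = \sum_{j=1}^{\nu} \mathfrak{S}_{j,\nu}\, \kappa_{[j]},$

where $\mathfrak{S}_{j,\nu}$ are the Stirling numbers of the second kind.

Then, as $N \to \infty$,


$V = (S - E(S))/\sigma_S$


is asymptotically distributed by the standard normal distribution
(mean 0, variance unity), whenever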

From  (29), 


N  N 

log  $_(t)  •  nm  log(l-p)  ♦  l  log(t-t^)  •  J  log(HT.t) 
m  1*1  J  1«1 


Thus, 


•  i  i  luiL  <-ir  . 

1»1  k«1  k 


<[v]l/vl  lT)l  ‘"'v 


■jS,  8J.»  “031  ‘  et  Nl 


since the Stirling numbers $\mathfrak{S}_{j,\nu}$ do not depend on N, n, m, or p.
We now establish the following theorem.


Theorem 6. $V = (S - E(S))/\sigma_S$ has an asymptotically standard normal
distribution as $N \to \infty$, whenever any of the following conditions are
satisfied.


i.  aja-o,  p-p*im  tnd  Jjlj^ 


«•  T'°-  (1-p)  ■*  0  so  that  for  soma  c  >  0, 


(1-p)  ■  c(^jp)  +  o((!2^)  ),  0  <  p  <  1 ,  and 


m 


Proof.  From  (11),  we  can  write,  for  o  •*  0  , 


<2  ■  N(e"a)(l-e’a  -  ope"a)  ♦  O(npa)  +  0(paaa) 


where  a  ■  .  Then,  as  mO, 


k,  •  N(1-a+aa/2)(a-£«--ap +  aap)  +  0(Ncta)  +  0(mna) . 


Then,  If  p  *  p*  f  1  , 


<2  ■  Na(l-p)  ♦  0(Naa) 


whenever  -tW* 
N 


Similarly,  If  (1  -  p)  -  c(Ejj£)P  ♦  C((^)P),  0  <  p  <  1 , 


c  >  0  , 


<2  *  Na(l  •  p)  +  o(Na(l  -p)) 


APPLICATION OF EXPERIMENTAL DESIGN TO THE EVALUATION OF EXPERT OPINION


Franklin E. Womack and Carl B. Bates
US Army Concepts Analysis Agency
Bethesda, Maryland 20814-2797

ABSTRACT. Expert opinion can be a valuable source of information to
tap in the building of a systems model. At the US Army Concepts Analysis
Agency (CAA), the computer model FORCEM (Force Evaluation Model) is used to
evaluate the theater-level combat system. FORCEM is built and maintained
by a group of CAA analysts. The command and control part of FORCEM is a
logical surrogate for the field commander at various levels of combat
(i.e., theater, army, corps, or division). A simulated war is conducted by
exercising FORCEM. The command and control part of FORCEM is allowed to
perceive information about the state of the war through a perception data
base. Using the information from the data base, it applies decision rules
for the further conduct of the war. In order to validate these decision
rules and make enhancements to the present model, 81 students at the Army
War College, Carlisle, Pennsylvania, participated in an information
gathering experiment. Several decisions from the command and control part
of FORCEM were presented to these experts in the form of a structured
experimental design. Information from the perception data base served as
factors for the experimental design, and responses were solicited from
these experts. This paper discusses the experimental design employed and
the statistical analysis performed.

Comments by panelists Drs. Kaye Basford and K. T. Federer are at the
end of this article.


1. INTRODUCTION. The US Army Concepts Analysis Agency developed the
Force Evaluation Model (FORCEM) during the period 1982-1985. FORCEM is a
fully automated computer simulation of a conventional theater campaign
treating combat, combat support, and combat service support in a theater of
operations. The model is used in studies of the capabilities of current
combat forces; requirements for support forces; and requirements for
personnel, supplies, and major items of equipment.

FORCEM is a time-sequenced model; each cycle represents 12 hours of
simulated time. At the beginning of each cycle, intelligence and
communications determine a set of perceived data for each headquarters
unit. Based on these data, command and control (C2) decisions are made.
Then the activities of the cycle are represented: combat movement and
combat service support.

Command and control representation depends on a perception data base
and decision algorithms. The decision algorithms are built into the model
and involve a set of input threshold parameters. This paper addresses a
study of the C2 decision algorithms.


2. PROBLEM DESCRIPTION. The purpose of the study was to verify or
enhance the C2 decision algorithms of FORCEM. The decision algorithms
considered are identified in Table 1. Each decision algorithm was examined
and the factors within the algorithm were identified. Naturally, some
factors are contained in more than one algorithm. The 12 unique factors
involved in the 8 decision algorithms are listed in Table 2.



Table 1. FORCEM Decisions Considered

Number   Decision
  1      Assignment of New Corps
  2      Assignment of New Division
  3      Assignment of New Field Artillery Battalion
  4      Designation of Posture of Online Corps
  5      Specification of Priority to Corps for CAS
  6      Specification of Priority to Corps for CSS
  7      Specification of Priority to Division for CAS
  8      Specification of Priority to Division for CSS

Table 2. Decision Factors

Symbol   Factor
  A      Has Reserve Corps
  B      Corps Engagement Status
  C      Corps Force Ratio
  D      Location of Objective of Corps/Posture of Corps
  E      Echelon to Which Corps Assigned/Has Reserve Corps
  F      Echelon to Which Corps Assigned
  G      Ratio of Corps Artillery Battalions to Divisions
  H      Location of Objective of Corps
  I      Posture of Parent Army
  J      Division Equipment Status
  K      Division Force Ratio
  L      Echelon to Which Division Assigned

The levels of each of the factors are given in Table 3. Factor D is
actually a combination of two factors (Objective Location and Posture);
however, all factor-level combinations of the two factors did not exist.
Only the six combinations shown in Table 4 existed.


Table 3. Levels of Decision Factors

                                         Level
Factor   1             2               3                4           5            6
  A      No res        Has res
  B      No res        Engaged
  C      1:3           1:1             3:1
  D      Rear/Withdr   Reached/Delay   Reached/Defend   Fwd/Delay   Fwd/Defend   Fwd/Attack
  E      Reserv        Onln/Yes        Onln/No
  F      Reserv        Online
  G      1.00          0.25
  H      Rear          Reached         Forward
  I      Delay         Defend          Attack
  J      No            Yes
  K      1:3           1:1             3:1
  L      First         Second          Third


Table 4. Possible Location/Posture Combinations

Objective                    Posture
location     Withdraw   Delay   Defend   Attack
Rear            1
Reached                   2       3
Forward                   4       5        6

All the factors within each experiment were completely crossed with
all other factors of the experiment. Consequently, all designs were
factorial designs. The factors and the number of cells are shown in Table
5. The sizes of the experiments range from 12 to 108 cells.


Table 5. The Eight Experiments

Decision number    Factors    Number of levels    Number of cells

For each experiment, a questionnaire was developed that described the
scenarios defined by the factor-level combinations (cells). Subjects
(military officers) were asked to assign a criticality index (from 0 to
100) except for decision number 4. For decision 4, the subject's response was
one of the four postures: Withdraw, Delay, Defend, or Attack.

3. TEST METHOD. The approach was to use students at the US Army War
College as subjects, use computerized questionnaires for each of the eight
decisions, and collect data from the Army "experts" concerning the
criticality of each of the scenarios of each of the eight decisions.

To test the feasibility of the planned approach, a pilot test was
conducted in-house. Decision number 1, which involves factors A, B, C, and
D, was selected for the pilot test. Nine senior officers of the US Army
Concepts Analysis Agency were selected as subjects. In the pilot test,
only the high and low levels (1:3 and 3:1) were used for factor C (corps
force ratio). Five to ten practice questions (Figure 1) were given before
the 48 questions of the 2x2x2x6 design were given to allow for any learning
effect. To assess the subject effect, Subjects (S) were treated as a
random factor (factors A, B, C, and D were fixed). Five of the cells were
replicated to provide an estimate of within error variance.
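The cell counts for decision 1 follow directly from crossing the factor levels. The sketch below enumerates the full factorial design for factors A, B, C, and D and the reduced pilot design in which factor C keeps only its low and high levels. The level labels are copied from Table 3 as printed; the variable names are illustrative.

```python
from itertools import product

# Factor levels for decision 1, as listed in Table 3.
levels = {
    "A": ["No res", "Has res"],
    "B": ["No res", "Engaged"],
    "C": ["1:3", "1:1", "3:1"],
    "D": ["Rear/Withdr", "Reached/Delay", "Reached/Defend",
          "Fwd/Delay", "Fwd/Defend", "Fwd/Attack"],
}

# Every combination of one level per factor is one cell (one scenario).
full_design = list(product(*levels.values()))

# The pilot test kept only the low and high levels of factor C.
pilot_levels = dict(levels, C=["1:3", "3:1"])
pilot_design = list(product(*pilot_levels.values()))

print(len(full_design))   # 72
print(len(pilot_design))  # 48
```

Each tuple in `pilot_design` corresponds to one of the 48 questions of the 2x2x2x6 pilot design.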


YOU WILL BE ASKED TO RESPOND BY ENTERING A NUMBER BETWEEN 0 AND 100
based on the following scale of how critical
you think it is for the newly arrived CORPS
to be assigned to reserve status behind the
ONLINE CORPS. After entering a number hit 'XMIT'.


0          20          40          60          80          100


NOT        SLIGHTLY    MODERATELY  VERY        EXTREMELY
CRITICAL   CRITICAL    CRITICAL    CRITICAL    CRITICAL


PLEASE HIT 'XMIT' NOW TO PROCEED


PAUSE 00000
WARMUP NUMBER # 1

1. There is currently at least one CORPS assigned in reserve behind the
ONLINE CORPS.

2. The ONLINE CORPS is currently engaged.

3. The location of the parent Army's Objective Phase Line is now located
at the present position of the ONLINE CORPS' current forward phase line.

4. Assuming all divisions currently assigned to the ONLINE CORPS are in
place, the current posture of the ONLINE CORPS is defend.

5. Assuming all divisions currently assigned to ONLINE CORPS are in place,
the friendly-to-enemy combat worth force ratio is currently perceived to be
FRIEND:ENEMY (1:3)

PLEASE RESPOND BY ENTERING A NUMBER BETWEEN 0 AND 100
based on the aforementioned scale of how critical
you think it is for the newly arrived CORPS
to be assigned to reserve status behind the
ONLINE CORPS. After entering a number hit 'XMIT'.

PLEASE ENTER THE NUMBER NOW.


Figure 1. Sample Question


An analysis of variance (ANOVA) was performed on the data. The ANOVA
model was

$y = \mu + A + B + C + D + S$
    $+ AB + AC + \ldots + DS$
    $+ ABCDS + R,$

where y represents criticality index; $\mu$ is a true but unknown constant; A,
B, C, D, and S are as defined above; and R is residual. All effects
involving S were tested over MS(R), and all fixed effects were tested over
their corresponding interaction with S. That is, the F-ratio for testing
the factorial effect of factor A is MS(A)/MS(AS), and the F-ratio for
testing the AB interaction effect is MS(AB)/MS(ABS). Some of the Subject
variance components were statistically significant; however, the four fixed
effects factors accounted for over 60 percent of the total variability.
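To illustrate the testing rule for a fixed effect, the sketch below computes the F-ratio MS(A)/MS(AS) for the simplest possible case: one fixed factor A crossed with subjects S and a single response per cell. This is a simplification of the five-factor model above, and the scores are made up purely for illustration; they are not data from the pilot test.

```python
def mixed_model_f(y):
    """F-ratio MS(A)/MS(AS) for a balanced layout with one response per cell.

    y[i][j] is the response at level i of the fixed factor A for subject j.
    """
    a, s = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * s)
    a_means = [sum(row) / s for row in y]
    s_means = [sum(y[i][j] for i in range(a)) / a for j in range(s)]

    # Sum of squares for the fixed factor A.
    ss_a = s * sum((m - grand) ** 2 for m in a_means)
    # Sum of squares for the A-by-subject interaction.
    ss_as = sum(
        (y[i][j] - a_means[i] - s_means[j] + grand) ** 2
        for i in range(a) for j in range(s)
    )
    ms_a = ss_a / (a - 1)
    ms_as = ss_as / ((a - 1) * (s - 1))
    return ms_a / ms_as


# Hypothetical scores: 2 levels of A (rows) by 3 subjects (columns).
scores = [[1.0, 2.0, 3.0],
          [2.0, 4.0, 6.0]]
print(mixed_model_f(scores))  # 12.0
```

The ratio would be referred to an F distribution with (a - 1) and (a - 1)(s - 1) degrees of freedom, here 1 and 2.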


The ANOVA results were used to give a hypothesized "significant" model
for fitting. Dummy variables were used for the qualitative factors and
regression analysis was used to develop a prediction equation. This
prediction equation provided the model to be compared with the current
FORCEM algorithm for the particular decision. The comparison is shown in
Table 6, which contains the regression model predicted values, the 48 cell
means, and the current algorithm priority. The first and the forty-eighth
priorities of all three priorities agree. Also, the first six to seven and
the last five of the regression model and cell mean priorities agree.


Table 6. Comparison of Models

Regression model         Cell mean                Present
predicted value          critical index           FORCEM
and priority             and priority             priority

Priority   Value         Priority   Value
    1       99.4             1       96.2             1
    2       92.6             2       91.4            13
    3       79.8             3       79.3            25
    4       72.9             4       77.8            37
    5       64.6             5       67.1             3
    6       57.8             6       65.0            15
    7       54.4             9       54.5             5
    8       53.8             7       62.1             7
    9       53.6             8       58.4             2
   10       47.5            12       43.4            17
   11       47.0            14       43.3            19
   12       46.8            10       47.8            14
   13       45.4            11       44.5             9
   14       45.0            13       43.4            27
   15       41.8            15       40.0            26
   16       38.6            17       35.8            21
   17       38.1            21       32.4            39
   18       37.5            18       35.6            11
   19       35.0            22       31.1            38
   20       34.8            16       37.7            29
   21       34.2            25       30.0            31
   22       33.8            19       34.4             4
   23       31.1            20       32.5             8
   24       30.7            24       30.5            23
   25       27.9            27       26.7            41
   26       27.3            30       24.7            43
   27       26.9            28       25.3            16
   28       26.1            26       28.4             6
   29       25.8            23       30.6            33
   30       25.4            31       22.3            10
   31       24.3            29       25.1            20
   32       22.0            37       17.7            28
   33       21.3            33       19.0            12
   34       19.2            40       14.7            32
   35       19.3            36       18.0            18
   36       18.9            32       20.1            45
   37       18.5            38       17.7            22
   38       17.9            34       18.1            35
   39       15.1            35       18.1            40
   40       14.5            44       12.8            24
   41       14.3            41       14.6            30
   42       13.5            42       13.0            34
   43       12.5            39       17.2            44
   44       11.0            43       13.0            47
   45        9.5            45       10.7            36
   46        7.5            47        8.5            42
   47        6.7            46        8.8            46
   48        2.7            48        5.5            48

The regression model equation was considered to be an adequate fit of
the data for the intended purpose. Consequently, the pilot test was
considered successful, despite the fact that the results of the developed
models were inconsistent with the algorithm priorities. The decision was
made to proceed with the project as planned.

4.  DATA COLLECTION.  A questionnaire was computerized for each of the
eight decisions in Table 1 and administered to a group of students
(Subjects) from the US Army War College. The experiments were administered
on four afternoons during December 1985 and January 1986. Each afternoon
consisted of two 2-hour sessions with approximately 10 subjects. The
allocation of subjects to experiments is shown in Table 7. Experiments 1,
2, 3, 7, and 8 were administered to 20 subjects, experiments 5 and 6 were
administered to 21 subjects, and experiment 4 was administered to all 81
subjects.


Table 7.  Allocation of Subjects to Experiments


   Group          Decision          Number of
   (session)      (1 to 8)          subjects

       1          X  X  X              10
       2          X  X  X              10
       3          X  X  X              10
       4          X  X                 10
       5          X  X  X              10
       6          X  X                 10
       7          X  X  X              10
       8          X  X  X              11


5.  ANALYSIS.  In addition, it was recognized that the present FORCEM
decision could be written as a linear equation. In decision #1, one online
corps among several candidates must be chosen by the theater headquarters
to receive a newly arrived corps in reserve. The factors used to make this
decision are A, B, C, and D, discussed in Table 2 above. Each candidate has
a specific set of four values associated with it. Each such value
corresponds to a particular level of one of the factors as discussed in
Table 3 above. For each candidate, the equation
Y = 55 - 36*A + 18*B - C + 3*D is evaluated using the four values
associated with it. The Y value so calculated is the priority for the
candidate. The candidate corps whose priority is larger than that of all
the other candidates is chosen to receive the newly arrived corps in
reserve. If two or more candidates tie with the largest priority, no
decision can be made based on these factors. In this case, each of the four
values for these candidates would be equal. This would imply their
equivalence in relation to the four factors considered.
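
The selection rule just described can be sketched as follows. The linear
equation is the one quoted in the text; the numeric level codes passed in, the
corps names, and the function names are hypothetical, since the level coding
of Table 3 is not reproduced here.

```python
def forcem_priority(a, b, c, d):
    """Priority from the linear form quoted in the text."""
    return 55 - 36 * a + 18 * b - c + 3 * d

def choose_corps(candidates):
    """candidates maps a corps name to its (A, B, C, D) level values.
    Returns the name with the strictly largest priority, or None when two
    or more candidates tie for the largest priority."""
    scored = {name: forcem_priority(*levels)
              for name, levels in candidates.items()}
    best = max(scored.values())
    winners = [name for name, score in scored.items() if score == best]
    return winners[0] if len(winners) == 1 else None

# Hypothetical level codes, for illustration only.
chosen = choose_corps({"I Corps": (1, 0, 0, 0), "II Corps": (0, 1, 0, 2)})
```

Returning None on a tie mirrors the statement that no decision can be made
from these four factors when the largest priority is shared.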


A more fruitful analysis could be obtained if the subjects' responses
could be transformed to a response similar to the priority value assigned
to each online corps by the present FORCEM algorithm. One transformation
that showed definite promise was the rank transformation. The rank
transformation used consisted of ranking each subject's responses from 1 to
N, where N is the number of cells in the particular design (N = 72 for
Decision #1). The smallest criticality index, usually a zero, was assigned
the value 1 and the largest criticality index, say 100, was assigned a 72.
Where several responses of the subject had the same value (i.e., ties), the
average rank was used. The rank transformation did not seem to affect the
overall results obtained in the original cell means model, and had the
added advantage of being directly testable against the present linear model
of the FORCEM algorithm. Using the ranked responses of the military
experts and estimating coefficients of the same linear model of the FORCEM
algorithm, the equation

     Y = 62.4 - 15*A + 3.94*B - 12.1*C + 4.27*D

was obtained. However, this model lacked fit, and a better model was
obtained by adding terms related to the significant cross products of the
cell mean model,

     Y = 50.0 - 9.57*A + 10.4*B - 14.2*C + 17.1*D - 6.17*D^2
           + 0.698*D^3 - 4.28*A*B + 0.588*C*D.

Testing the null hypothesis of no difference between this model and the
model of the FORCEM algorithm, one obtains a calculated F ratio of 215.3.
This is much larger than F(9, 1368, .95) = 1.92. This implies that the
hypothesis of no difference between the models must be rejected.
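
The rank transformation with average ranks for ties can be sketched in a few
lines. This is an illustrative implementation, not the authors' program; the
function name is hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to N (smallest gets 1); tied values share the
    average of the ranks they would otherwise occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # extend the run while the next sorted value equals the current one
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1   # mean of 1-based ranks in the run
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

# One subject's responses (illustrative values only).
ranks = average_ranks([0, 50, 50, 100])   # -> [1.0, 2.5, 2.5, 4.0]
```

Applied to each subject's N responses separately, this puts every subject on
the same 1-to-N scale regardless of which part of the 0 to 100 scale the
subject tended to use.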
Decision 4, Designation of Posture of Online Corps, was treated
differently from the other decisions because it involves an ordered
categorical response variable. The response variable is posture. The
subject was required to choose the most appropriate posture for a given set
of input factors. The choice was attack, defend, delay, or withdraw. The
factors gave a structure on which to base the experiment; however, each
cell was analyzed independently of the other cells. The factors C, H, and
I are described in Table 2, and the levels are shown in Table 3. In
FORCEM, a definite posture must be assigned to a corps given a set of
factor levels. A posture assignment is unique for a given set of factor
levels and is given to each corps possessing a particular factor-level
combination during a run of FORCEM. In the real world, posture assignment
would probably be stochastic rather than deterministic.

An approach to dealing with this statistically is to test each cell with a
simple statistical hypothesis test. For each of the 27 cells, the null
hypothesis for the cell is that no more than half of the expert population
chooses any one of the postures. The alternate hypothesis, the statement
desired for the cell, is that more than half of the expert population
chooses one common posture; i.e., a "majority" posture. The test takes the
form of H0: p ≤ 0.5 and HA: p > 0.5. The random variable Xi (i = 1 to 81,
for the sample of 81 expert subjects) takes on the value 1 when a subject
picks the posture with the largest number of responses (i.e., the "favored"
posture) in the cell under consideration; the probability that Xi = 1 is p.
The random variable Xi takes on the value 0 if the subject picks any other
posture; the probability that Xi = 0 is (1 - p). If there is a tie for the
favored posture, the test cannot logically result in a rejection of the
null hypothesis. Assuming there is a favored posture, a test must be
constructed to decide whether (1) to reject the null hypothesis or (2) not
to reject the null hypothesis because of insufficient evidence to the
contrary.

The appropriate distribution is the distribution of the sum of the random
variables Xi. This is the binomial distribution with parameters N = 81 and
p = 0.5. A critical region must then be determined for which the null
hypothesis is rejected when in fact true with no more than a stated
probability. This probability is referred to as alpha, the significance
level of the test. For the case under consideration (N = 81), with
alpha = 0.05, the critical region corresponds to a count of responses of
K = 48. For alpha = 0.01, K = 52. On this basis, the count for each of the
27 cells is tested in the hope of rejecting the null hypothesis. Table 8
displays the results of the test. The favored posture is designated in the
cell for the given levels of the factors C, H, and I. The number of
subjects of the total of 81 choosing the posture is indicated in
parentheses. Double asterisks (**) indicate that the null hypothesis can be
rejected at the alpha = 0.01 level of significance, and a single asterisk
(*) indicates that the null hypothesis can be rejected at the alpha = 0.05
level. For the remaining cells (those without asterisks), there is
insufficient evidence to reject the null hypothesis; indeed, as noted in
the table, for some cells there is no favored posture.
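
A critical count for this kind of one-sided binomial test can be computed
directly from the exact upper tail. The sketch below is illustrative only; the
function names are hypothetical, and whether its output for N = 81 matches the
K values quoted above depends on whether those were obtained from the exact
distribution or from a normal approximation, which the text does not state.

```python
from math import comb

def upper_tail(n, k, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

def critical_count(n, alpha, p=0.5):
    """Smallest count k with P(X >= k) <= alpha under the null value p.
    A return value of n + 1 means no count in 0..n can reject."""
    for k in range(n + 2):
        if upper_tail(n, k, p) <= alpha:
            return k

# Small worked case: with n = 10 and alpha = 0.05 the cutoff is 9,
# since P(X >= 9) = 11/1024 while P(X >= 8) = 56/1024.
k_small = critical_count(10, 0.05)

# The same call for the sample size in the text:
k_05 = critical_count(81, 0.05)
k_01 = critical_count(81, 0.01)
```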
               1:3     Delay (51)*      Defend (42)      Defend (56)**
               1:1     Delay (47)       Defend (46)      Defend (51)*
               3:1     Delay (31)#      Defend (39)#     Attack (58)**

   Defend      1:3     Defend (37)#     Defend (72)**    Defend (68)**
               1:1     Defend (47)      Defend (70)**    Defend (52)**
               3:1     Defend (38)#     Defend (53)**    Attack (70)**

   Attack      1:3     Defend (59)**    Defend (61)**    Defend (51)*
               1:1     Defend (53)**    Defend (49)*     Attack (43)
               3:1     Attack (52)**    Attack (67)**    Attack (81)**

   Key:   **   significant at alpha = 0.01
           *   significant at alpha = 0.05
           #   no majority posture

6.  SUMMARY.  Concerning the seven experiments having criticality as
the response variable, the smaller experiments appeared more successful
than the larger experiments. Subjects' responses suggest that the scale 0
to 100 is too large a scale. Most subjects assigned values by 10s (10, 20,
30, ...); some assigned values by 5s (5, 10, 15, ...); and very few
assigned values by unity, such as 17, 43, or 83. Large experiments may
exceed the subjects' ability to differentiate. There was also evidence that
not all subjects were on the same scale. Some tended to use the lower
portion, some the center, and some the upper portion of the scale.
Heterogeneity was also a problem. This also seemed more severe with the
larger experiments.

Concerning the experiment with the discrete response, it was not felt
that the analysis employed was the most appropriate. Time did not permit
further study and research of the problem.

Finally, if the subjects employed are indeed experts, the statistical
methods of experimental design, analysis of variance, and regression
analysis have potential for verification of algorithms of simulation
models.

A number of experiments were conducted using a
factorial treatment design and groups of 20 (21 in
one case) students in an experiment. The response
variable was a criticality score (zero to 100)
except for one response variable. The writers used
an effect by subjects interaction as an error mean
square for each effect. Why weren't the
interactions with subjects pooled to increase degrees of
freedom in an error term? Why wasn't an analysis
performed on the eight decisions and eight group
sessions in Table 7? Also, the regression model
used needs more explanation. Presumably, this is a
main effects regression model with the eight
decisions as the eight independent variables in the
regression equation. If the interactions are small
compared to main effects, it would be expected that
the agreement between predicted values from
regression and cell means would be good (see Table

Abstract

The Ballistic Research Laboratory (BRL) conducted an interactive
Firepower Control Experiment from 2 through 20 December 1985
to acquire knowledge on how military personnel make tactical fire
control decisions for field artillery, and, for the first time,
automatically collected data on the digital communications between the
field artillery battery Fire Direction Center (FA btry FDC) and
simulated 155mm howitzer firing units. This latter portion of the
experiment, the Battery Fire Direction Center (Btry FDC) portion, was
designed to test 3 levels of the number of howitzers per battery, 3
levels of simultaneous missions, and 2 levels of fire mission control
ratios with each other. The intended design was three replications
of a 3 x 3 x 2 factorial with the linear Howitzer x Mission
interaction blocked by day. Unforeseen software problems precluded the
completion of the design for this controlled laboratory experiment.
As a result, informative data were collected for only twelve of the
eighteen treatment combinations of a single replication. At the
conference, expert advice was solicited on the appropriate method
of analysis and the appropriate conclusions to draw from the
analysis of data collected from this experiment.


Comments by panelists Drs. Kaye Bosford and W. T. Federer are at the
end of this article.

I.  Introduction

The Ballistic Research Laboratory (BRL) conducted an interactive Firepower
Control Experiment from 2 through 20 December 1985 to acquire knowledge on
Fire Direction Officers' (FDOs') tactical fire control decisions, and, for
the first time, automatically collected data on the digital communications
between the field artillery battery Fire Direction Center (FA btry FDC) and
simulated 155mm howitzer firing units. The objectives of this experiment
were (1) to collect data on the FDOs' decisions on the type and volume of
howitzer fire for selected targets, and (2) to characterize the net
utilization between the Battery Computer System (BCS) at the btry FDC and
the Gun Display Units (GDUs) at each howitzer of the firing btry. These two
objectives were achieved by conducting a controlled laboratory experiment
that simultaneously focused on these two independent objectives, i.e.,
portions, of the Firepower Control Experiment.

II.  Test Configuration and Design

To run the portions together, the BRL integrated a commonly shared
database, uniquely developed BRL simulators, and a combination of tactical
and commercial computer equipment interfaced by a BRL "Bit Box", i.e., a
modem between GDU protocol and standard, commercial computer RS232
protocol. Officers from the U.S. Army Field Artillery School, Fort Sill,
OK, participated as FDOs and BCS operators while BRL's interactive
simulators emulated forward observers (i.e., the Multiple Forward Observer
SCEnario simulator, MFOSCE), the Tactical Fire Direction System (TACFIRE)
battalion FDC (i.e., the Fire Direction Simulator, FDS), and the firing
btry (i.e., the Gun Display Unit Simulator, GUNSIM). Figure 1 outlines
these major components of the laboratory setup, and Figure 2 depicts their
field counterparts.

Six different test cells containing sixty targets each were developed from
a Scenario Oriented Recurring Evaluation System Europe-I, Sequence 2A
(SCORES 2A) division slice. Each test cell was developed to contain an
identical mixture of twenty different target types. The sixty targets in
each test cell were randomized, and the six test cells were used to produce
a total of eighteen scenario test cells. All targets in each test cell were
forwarded to the FDO for selection of a type and volume of fire. Twelve
pre-identified targets of the sixty targets were sent to the BCS operator
as fire missions, i.e., targets to be fired on with the specified type and
volume of fire. It was hypothesized that the BCS would require an hour of
testing to fire all twelve fire missions and that an additional forty-eight
targets would be needed to "load" the FDO for an hour of testing. In
designing the experiment, it was implicitly assumed that the FDOs'
decisions on targets being forwarded to the BCS for simulated firing would
not affect the btry FDC portion's measures of performance.

The factors for the FDO portion of the experiment were (1) FDO, (2) target
type and subtype, (3) target size, (4) type of fire mission control, and
(5) the initial ammunition load (Figure 3). The factors for the btry FDC
portion of the experiment were (1) the number of simultaneous fire missions
at the BCS, (2) the number of howitzers in the btry, and (3) the fire
mission control ratios (Figure 4). The levels of each of these

FIREPOWER CONTROL EXPERIMENT


•  FDO 


3  levels,  i.e.,  3  FDOs 

•  TARGET  TYPE  AND  SUBTYPE 

10  levels,  i.e.,  10  different  target  descriptions 

•  TARGET  SIZE 

2  levels,  i.e.,  2  sizes  per  target  type  and  subtype 


•  TYPE  OF  FIRE  MISSION  CONTROL 

2  levels,  i.e.,  adjust  fire  and  fire-for-effect 

•  INITIAL  AMMUNITION  LOAD 

3  levels,  i.e.,  100%,  80%,  or  60%  of  a  basic  load 


Figure  3.  Factors  and  Levels  for  the  Fire  Direction 
Officer  Portion  of  the  Firepower  Control  Experiment 


NUMBER  OF  HOWITZERS  IN  A  BATTERY 

4  HOWITZERS 
6  HOWITZERS 
8  HOWITZERS 

NUMBER  OF  SIMULTANEOUS  MISSIONS 

1  MISSION 

2  SIMULTANEOUS  MISSIONS 

3  SIMULTANEOUS  MISSIONS 

CONTROL  RATIO  OF  THE  FIRE  MISSIONS 

2  ADJUST  FIRE  :  1  FIRE-FOR-EFFECT 
1  ADJUST  FIRE  :  2  FIRE-FOR-EFFECT 


Figure  4.  Factors  and  Levels  for  the  Battery  Fire  Direction 
Center  Portion  of  the  Firepower  Control  Experiment 


factors were selected as factors the BRL was interested in testing. First,
for example, the BCS is only designed to handle up to 3 fire missions at a
time. Second, each BCS currently handles 6 howitzers in the field, and
future alternative considerations may have the BCS handle 4 or 8 howitzers.
Third, fire mission control refers to how fire is directed on the target.
For all Adjust Fire (AF) missions being sent to the BCS operator, a default
of two "adjustments" (consisting of a total of two High Explosive rounds)
was fired to better locate the target's position before the expenditure of
the btry volleys, i.e., the Fire-for-Effect (FFE) portion of the fire
mission. In the case of FFE missions, the observer has accurately located
the target, and it is unnecessary to "adjust" before firing the btry
volleys.

During the first week of testing, the BCS operator noticed anomalous
behavior of the firing btry simulator, GUNSIM, in comparison to the actual
tactical equipment. While GUNSIM was modified, the FDO portion of the
experiment was run. As a result, these unexpected software problems
precluded the completion of the design for the btry FDC portion of the
Firepower Control Experiment. The remainder of this paper will focus on the
appropriate method of analysis for the data collected and computed from
this portion of the experiment.


III.  Battery FDC Portion of the Experiment
1.  Design Matrix and Measures of Performance

The intended design was three replications of a 3 x 3 x 2 factorial with
the linear Howitzer x Mission interaction blocked by day (Figure 5). The
purpose was to measure the effect of these factors and their interactions
on the btry fire direction (FD) net's message traffic. Two different
responses were computed to measure message traffic. The first, net
utilization, is computed by dividing the total transmission time by the
total time required to complete the simulated firing of the twelve fire
missions associated with a treatment combination. The significance of the
btry FD net's usage on the battlefield is that higher net usage increases
the enemy's opportunity to detect the locations of the btry FDC and the
155mm howitzers. Presumably, detection would lead to enemy destruction of
these important assets. The second, the average number of messages per
minute, is computed by dividing the total number of messages for a
particular treatment by the total time required. This indicates the number
of times the tactical equipment must be turned on and off to transmit and
receive messages. Both of these measures are important indicators of btry
FD net usage when radios (rather than wire) will link the FA btry FDC and
future semi-autonomous howitzer systems.
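
The two measures of performance defined above reduce to simple ratios. The
sketch below restates them; the function names, the choice of seconds as the
time unit, and the numbers in the usage lines are hypothetical and are not
data from the experiment.

```python
def net_utilization(total_transmission_s, total_elapsed_s):
    """Fraction of the elapsed time during which the net was transmitting."""
    return total_transmission_s / total_elapsed_s

def messages_per_minute(n_messages, total_elapsed_s):
    """Average number of messages per minute over the elapsed time."""
    return n_messages / (total_elapsed_s / 60.0)

# Hypothetical treatment combination: 900 s of transmission and 1800
# messages over a 3600 s run of twelve fire missions.
utilization = net_utilization(900, 3600)      # 0.25
rate = messages_per_minute(1800, 3600)        # 30.0
```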

As previously mentioned, mid-experiment software problems did not permit
the completion of this design. Consequently, data collected under
experimentally controlled conditions were available for only twelve of the
eighteen treatment combinations of a single replication of this design,
specifically, days 2 and 3 of the design matrix in Figure 6. This paper
will focus on the analysis of the average number of messages per minute for
these twelve treatment combinations.


[Figure: design matrix of number of howitzers per battery by number of
missions, with control ratio entries 1:2 and 2:1.]


2.  Average Number of Messages Per Minute


Six different fixed format messages can be transmitted on the btry FD net,
and each of these messages has a different purpose and fixed message
format. Message types A, B, C, and D correspond to messages associated with
instructions from the BCS operator to the btry personnel via the GDU
located at every howitzer, and message types E and F are the messages
associated with polling between the BCS and GDUs (Table 1), i.e., requests
and responses for the firing status of each howitzer. Before the body of
each of these messages, a preamble (i.e., a continuous 1200 hertz sine
wave) is transmitted for a specific time to allow the transmitting and
receiving equipment to reach operating conditions. The minimum
specification preamble for BCS and GDU messages, i.e., 250 milliseconds,
was used for all message preambles on the btry FD net throughout the
experiment.1

Table 2 presents the average number of messages per minute for the twelve
treatment combinations from the experiment. From scanning this table, the
average number of messages for the treatment combination 8 howitzers, 3
simultaneous missions, and a 2:1 AF:FFE control ratio is low compared to
the surrounding treatment combinations. A detailed investigation revealed
that the busy BCS operator failed to act on the first several transmissions
of a critical mission message, and for another mission, the operator sent
an erroneous message creating approximately 4 minutes of net silence. The
net result of these actions was that the BCS operator was essentially only
actively working with 2 of the 3 mission buffers. From these data, the BRL
wanted to determine if the number of howitzers, simultaneous missions,
control ratios, or their interactions had a significant effect on this
measure of performance, and solicited expert advice on the following
proposed method of analysis and on suggestions for alternative methods of
analysis.


Table 1.  Message Formats and Message Lengths


   Message                             Transmitted   Transmitted   Length
   Code      Message                   by            to            (in characters)

   A         Common Data               BCS           GDUs
   B         Common Special            BCS           GDUs
   C         Individual Gun Order      BCS           GDU
   D         Control                   BCS           GDUs
   E         Request                   BCS           GDU
   F         Response                  GDU           BCS


Table 2.  Average Number of Messages Per Minute
Transmitted on the Battery Fire Direction Net

[Table body not recoverable. Columns: number of howitzers per battery.
Rows: simultaneous mission(s) and control ratio (AF:FFE).]


   1 External Interface Specification for Computer, Gun Direction
     CT-1317( )/GYK-29, Part of the Computer System, Gun Direction
     AN/GYK-29( )(V), United Technologies Corporation, Norden Division,
     Specification No. Kl.-C T-2C78D-TK, 31 October 1981.


IV.  Analysis of Battery FDC Portion of the Experiment
1.  Proposed Cell Mean Estimation Procedure

The cell means model equation for the btry FDC experiment is

\[
y_{ijkn} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_k
         + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk}
         + (\alpha\beta\gamma)_{ijk} + e_{ijkn}
\]

where

   $y_{ijkn}$       observation for control ratio level i, simultaneous
                    mission level j, howitzer level k, and observation n
   $\mu$            overall mean
   $\alpha_i$       effect of control ratio level i, i = 1,2
   $\beta_j$        effect of simultaneous mission level j, j = 1,2,3
   $\gamma_k$       effect of howitzer level k, k = 1,2,3
   $(\beta\gamma)_{jk}$   effect of blocking
   $(\alpha\beta)_{ij}$, $(\alpha\gamma)_{ik}$, $(\alpha\beta\gamma)_{ijk}$
                    two- and three-way interactions
   $e_{ijkn}$       error for observation, which is distributed
                    independently and normally with mean 0 and variance
                    $\sigma^2$, i.e., $N(0, \sigma^2)$

This model is overparameterized for the btry FDC experiment since
observations are missing for six cells. It was recommended* that a cell
mean estimation procedure using the basic linear model could be employed to
estimate the six missing cells.2

   * Jock O. Grynovicki, U.S. Army Ballistic Research Laboratory, Aberdeen
     Proving Ground, MD.

   2 Hocking, Ronald R., The Analysis of Linear Models, Monterey, CA:
     Brooks/Cole Publishing Company, 1985.

Using this procedure, estimates for the missing cell means can only be made
if the model is constrained by assuming one or more interactions are zero.
The application of constraints, however, may not relate all missing cell
means to the other observed cell means, and will yield one of two types of
models: (1) connected models where all means of the missing cells are
linearly estimable and any linear hypothesis on the cell means can be
tested; and (2) unconnected models where not all missing cell means are
estimable and the hypotheses of interest may still be tested. If one can
justify the constraints necessary to produce a connected design and the
constraints are valid, then stronger conclusions can be drawn; however,
constraints should not be applied just to produce a connected design.
Hocking also notes that there are varying degrees of connectedness, and the
application of additional constraints increases the precision of missing
cell mean estimates.

For the btry FDC experiment, the first reasonable constraint would be to
assume that there is no three-way interaction Howitzer x Mission x Control
Ratio, i.e., $(\alpha\beta\gamma)_{ijk} = 0$. Based on this assumption, the
missing cell means, $\mu_{ijk}$, can be estimated by the following
equation:

\[
\mu_{ijk} - \mu_{i'jk} - \mu_{ij'k} + \mu_{i'j'k}
  = \mu_{ijk'} - \mu_{i'jk'} - \mu_{ij'k'} + \mu_{i'j'k'}
\]

where

   $i, i' = 1,2$;   $i \neq i'$   for the control ratio levels
   $j, j' = 1,2,3$; $j \neq j'$   for the simultaneous mission levels
   $k, k' = 1,2,3$; $k \neq k'$   for the howitzer levels


However, this constraint yields an unconnected model with none of the six
missing cells being linearly estimable. Based on the design assumptions,
another reasonable assumption would be that there is no Howitzer x Mission
interaction, i.e., $(\beta\gamma)_{jk} = 0$, in addition to no Howitzer x
Mission x Control Ratio interaction. Based on these two assumptions, the
missing cell means can be estimated by the following formula:

\[
\mu_{ijk} - \mu_{ij'k} = \mu_{ijk'} - \mu_{ij'k'}
\]

These constraints provide estimates for the 6 missing cell means, and the
associated single effective constraint is

\[
\mu_{i11} - \mu_{i13} - \mu_{i22} + \mu_{i23} - \mu_{i31} + \mu_{i32} = 0 .
\]

Using these constraints, the missing cell means can be related to the
observed cell means as follows:

\[
\begin{aligned}
\mu_{112} &= \mu_{113} - \mu_{123} + \mu_{122} = 28.80 - 38.10 + 34.09 = 25.63 , \\
\mu_{212} &= \mu_{213} - \mu_{223} + \mu_{222} = 25.89 - 40.93 + 40.98 = 25.92 , \\
\mu_{121} &= \mu_{123} - \mu_{113} + \mu_{111} = 38.10 - 28.80 + 24.51 = 33.81 , \\
\mu_{221} &= \mu_{223} - \mu_{213} + \mu_{211} = 40.93 - 25.89 + 25.06 = 40.10 , \\
\mu_{133} &= \mu_{131} - \mu_{111} + \mu_{113} = 41.19 - 24.51 + 26.80 = 43.48 , \\
\mu_{233} &= \mu_{231} - \mu_{211} + \mu_{213} = 42.73 - 25.06 + 25.89 = 43.56 .
\end{aligned}
\]
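
The estimation formula under the no Howitzer x Mission interaction assumption
can be sketched as a small function. The cell values below are the control
ratio level 2 means quoted in the equations above; the dictionary layout
(keys are mission level j and howitzer level k) and the function name are
hypothetical, and the sketch covers a single control ratio level only.

```python
def estimate_missing_cell(mu, j, k, j_ref, k_ref):
    """Estimate the missing mean at (j, k) from three observed means, using
    mu[j, k] - mu[j_ref, k] = mu[j, k_ref] - mu[j_ref, k_ref]."""
    return mu[(j_ref, k)] + mu[(j, k_ref)] - mu[(j_ref, k_ref)]

# Observed means for control ratio level 2, keyed by (mission, howitzer).
mu_level_2 = {
    (1, 1): 25.06,
    (1, 3): 25.89,
    (2, 2): 40.98,
    (2, 3): 40.93,
    (3, 1): 42.73,
}

mu_221 = estimate_missing_cell(mu_level_2, j=2, k=1, j_ref=1, k_ref=3)
mu_233 = estimate_missing_cell(mu_level_2, j=3, k=3, j_ref=1, k_ref=1)
# round(mu_221, 2) -> 40.1 and round(mu_233, 2) -> 43.56, as in the text.
```

Each estimate depends on the choice of reference row and column being
observed cells, which is why only certain (j_ref, k_ref) pairs are usable
for a given missing cell.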


Table 3 provides the estimates for the 6 missing treatment combinations
along with the 12 treatment combinations from the experiment. By using the
values in this table, an analysis of variance (ANOVA) was performed and is
provided in Table 4.


Table 4.  ANOVA on the Effect of the Factors
on the Average Number of Messages Per Minute


                                 DEGREES OF     SUM OF        MEAN
   SOURCE                        FREEDOM        SQUARES       SQUARE

   Howitzers                        2            22.8550      11.4280
   Missions                         2           760.0711     380.0356
   Control Ratio                    1            55.6864      55.6864
   Howitzers X Control Ratio        2            21.0085      10.5403
   Missions X Control Ratio         2            29.4570      14.7285
   Pooled Error                     2            71.4755      35.7378

   Total                                        960.0444

2, 2,0-0.05  ~ 
1,2,0-0.06  =  200 


«1 
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One  should  note  that  the  Howitzer  x  Mission  x  Control  Ratio  and  the  Howitzer  x  Mis¬ 
sion  interactions  were  pooled  for  error  since  it  was  assumed  that  these  interactions  were 
not  significant  in  the  cell  mean  estimation  procedure.  From  Table  4,  one  concludes 
that  none  of  the  main  effects  or  twoway  interactions  are  significant  at  a  =  0.05  based 
on  the  assumptions  of  no  Howitzer  x  Mission  x  Control  Ratio  and  Howitzer  x  Mission 
interactions.  If  either  of  these  assumptions  are  incorrect  then  the  pooled  error  is  biased, 
using  a  biased  error  value  lowers  the  F  ratios  and  can  result  in  factors  or  their  interac¬ 
tions  being  statistically  insignificant. 
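
As a rough check of the conclusion drawn from Table 4, the F ratios can be recomputed from the printed mean squares. The sketch below is illustrative only: the variable names are assumptions, and the critical values 19.00 and 18.51 are the standard upper 5% points of F(2, 2) and F(1, 2) rather than numbers taken from a legible part of the source.

```python
# Minimal sketch: F ratios against the pooled error mean square of Table 4.
mean_squares = {
    # source: (mean square, degrees of freedom)
    "Howitzers": (11.4280, 2),
    "Missions": (380.0356, 2),
    "Control Ratio": (55.6864, 1),
    "Howitzers x Control Ratio": (10.5403, 2),
    "Missions x Control Ratio": (14.7285, 2),
}
ms_error = 35.7378  # pooled error mean square, 2 degrees of freedom

# Standard upper 5% points of F with (df, 2) degrees of freedom (assumed here).
f_critical = {1: 18.51, 2: 19.00}

f_ratios = {name: ms / ms_error for name, (ms, df) in mean_squares.items()}
significant = [
    name for name, (ms, df) in mean_squares.items()
    if f_ratios[name] > f_critical[df]
]

for name, f in f_ratios.items():
    print(f"{name:28s} F = {f:6.2f}")
print("Significant at 0.05:", significant)
```

With only 2 degrees of freedom for error, even the large Missions mean square yields an F ratio well below the critical value.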

A consequence of using this cell mean estimation procedure is that one-third of an unreplicated design was estimated based on two assumptions. The resulting ANOVA failed to detect any significant main effects or interactions despite apparent differences between certain levels of the factors. Additional experimentation would be required to test whether the assumptions associated with the cell mean estimation procedure were justified and to support more confidently the conclusion of no significant main effects or interactions. In lieu of additional testing and this cell mean estimation procedure, the panel recommended paired t tests.

9.  Suggested Paired t Tests

Based on the panel's suggestions, paired t tests were performed on the data to test for a significant difference between the means, i.e., the average number of messages, of the levels of each factor assuming no interactions. The null hypothesis, H0, for each test was that there was no difference between the means of two levels of a factor, versus the alternative hypothesis, H1, that the mean for a given level exceeded that of another. H0 was rejected in favor of this one-sided alternative only if the difference between the means was significantly greater than zero.

An overall paired t test was computed for the difference between the 1:2 and 2:1 AF:FFE control ratio levels under the same howitzer and mission levels, i.e., 6 paired differences. H0 was not rejected at a significance level of α = 0.05, since the computed t statistic was close to but did not exceed the tabled t value, t(5 df) = 2.015. This was a bit surprising since only "one GDU's worth" of messages is requested and transmitted for each "adjustment", and each "adjustment" requires "one round's worth" of time. Thus, one would expect the average number of messages per minute for a 2:1 AF:FFE control ratio to be lower than for a 1:2 AF:FFE control ratio.
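
The mechanics of such a one-sided paired t test can be sketched as follows. The six pairs below are hypothetical numbers invented purely for illustration (the paired observations are not reproduced in this part of the text), and the function name `paired_t` is likewise an assumption; only the tabled value 2.015 for 5 degrees of freedom comes from the paper.

```python
# Minimal sketch of a one-sided paired t test on 6 paired differences.
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# HYPOTHETICAL messages-per-minute values for the two control ratios.
ratio_1_2 = [28.0, 24.0, 36.0, 40.0, 44.0, 25.0]
ratio_2_1 = [25.0, 25.0, 34.0, 40.0, 40.0, 27.0]

t_stat, df = paired_t(ratio_1_2, ratio_2_1)
t_tabled = 2.015  # one-sided 5% point of t with 5 df, as quoted in the text
reject_h0 = t_stat > t_tabled
print(f"t = {t_stat:.3f} on {df} df, reject H0: {reject_h0}")
```

The same function applies to the tests by howitzer level and by mission level, where only two differences (1 degree of freedom) are available for each test.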

In addition to this paired test, two other sets of paired t tests were computed: one with the pairs by howitzer level and the other with the pairs by mission level. In computing these tests, two difference pairs were obtained for each howitzer and mission level by computing the difference across a specific control ratio level. H0 was rejected if the computed t statistic exceeded the tabled t value at a significance level of α = 0.05, i.e., t(1 df) = 6.314. Only two of the six null hypotheses were rejected at a significance level of α = 0.05. First, H0 was rejected between 1 and 3 simultaneous missions for 4 howitzers. This supports the expectation that as the number of simultaneous missions increases, more missions are handled in a shorter time, i.e., the average number of messages per minute increases. However, no significant difference was detected at α = 0.05 between 2 and 3 simultaneous missions for 6 howitzers, or between 1 and 2 simultaneous missions for 8 howitzers. Second, H0 was rejected between 6 and 8 howitzers handling 2 simultaneous missions. This also supports the expectation that as the number of howitzers increases, more howitzers are sending and receiving messages in essentially the same amount of time, i.e., the average number of messages per minute increases. Similarly, no significant difference was detected at α = 0.05 between 4 and 8 howitzers handling 1 mission, or between 4 and 6 howitzers handling 3 simultaneous missions.


V.  Conclusions 

The data collected from the btry FDC portion of the Firepower Control Experiment suggest that different procedures should be considered to reduce btry FD net usage when radios will link the FA btry FDC and future semi-autonomous howitzer systems. The paired t tests on the average number of messages per minute detected a significant difference at α = 0.05 between 1 and 3 simultaneous missions for 4 howitzers, and between 6 and 8 howitzers handling 2 simultaneous missions. Although this supports our initial design assumptions that the number of howitzers and simultaneous missions significantly affect the usage of the btry FD net, it also clearly points out that completing the intended design could have produced more confident conclusions.
COMMENTS BY PANELISTS DR. KAYE BASFORD AND PROFESSOR W. T. FEDERER

ON THE FOLLOWING ARTICLE

Analysis of an incomplete block design of experiments by
Wendy A. Winner
and Jill H. Smith

U.S. Army Ballistic Research Laboratory

Kaye Basford:  Because full data were not collected on the original designed experiment, I suggest that it be analysed in a much simpler way. For instance, simple t tests or non-parametric tests could be used to compare fire mission control ratios over all howitzer and mission levels. Although not giving the detail of the planned analysis, it should allow some information to be obtained from the data collected.

W.T. Federer:  The resulting design is a two-thirds fraction of a 2 x 3² factorial of the following nature:

[Table: layout of the fraction, marking with x the 12 of the 18 treatment combinations that were run.]

where x denotes combination present and blank denotes combination absent. In the above fraction main effects will be estimable as well as 12 - (1 + 1 + 2 + 2) = 6 degrees of freedom for interactions. These 6 degrees of freedom are A x B (2 d.f.), A x C (2 d.f.), B (linear) x C (linear) (1 d.f.), and A x B (linear) x C (linear) (1 d.f.). Unless it were known that one or more of the degrees of freedom for interaction represented experimental error, no error mean square would be available for testing the significance of the effects. For no available error mean square, it is suggested that use be made of Cuthbert Daniel's half normal plot procedure to ascertain which of the eleven treatment single degrees of freedom sums of squares were alike and which were different. If the smaller contrasts responding similarly could be considered as possible candidates for no treatment effects, then an error term can be obtained using Cuthbert Daniel's procedure (see e.g. S.A. Krane (1963) "Half normal plots for multi-level factorial experiments", Proc. Eighth Conf. Design Expt. Army Res. Dev. Testing, pp 261-265).


A  HEURISTIC  APPROACH  TO  POST  -  HOC  COMPARISONS  FOR  SIGNIFICANT 
INTERACTIONS  -  A  SIMPLIFIED  NOTATION 

Eugene Dutoit
U.S. Army Infantry School
Fort Benning, Georgia

ABSTRACT:

The omnibus F ratio test used in analysis of variance is used to determine if any of the main or interaction effects are statistically significant. Customarily, various techniques are used for performing post-hoc comparisons on the statistically significant main effects. The purpose of this paper is to present a heuristic approach for post-hoc procedures on the significant interaction effects. These procedures use the conventional graphical methods to show the overall interaction effect and then apply conservative methods to detect the significant components of the overall interaction. The paper develops a graphical and notational method for decomposing a complex interaction into its significant components for further analysis. Examples are given for a two-way design with variables at two and more levels.

ACKNOWLEDGEMENT:  The author wishes to thank Dr. John Tukey for his suggestion to use a Bonferroni contrast in addition to the Scheffe method. The Bonferroni method will be calculated for each of the examples in this paper and the results compared with those obtained by using Scheffe contrasts.


SECTION 1 (A TWO-WAY ANOVA PROBLEM)

Consider the following two-way ANOVA problem obtained from Ostle. The dependent variable is the yield in soy beans (bushels/acre). The raw data and the resulting ANOVA are presented in the table below.

TABLE 1
TWO WAY ANOVA

                               Date of Planting
                    EARLY                            LATE
Fertilizer   Cl     Aero    Na      K         Cl      Aero    Na     K

             29     29      28      29        30      33      30     33
             37     29      27      26        32      31      33     32
             33     31      26      26        32      31      33     32
             33     29      29      32        31      34      34     29

Mean         33     29.5    27.5    29.25     31.25   32.25   32.5   31.5

Source             DF     SS        MS        F

Day of Planting     1     34.031    34.031    10.44    * (Sig)
Fertilizer          3     20.594     6.865     2.11    (Not Sig)
Interaction         3     47.344    15.781     4.84    ** (Sig)
Error              24     78.250     3.260

*  F(1, 24; .05) = 4.26
** F(3, 24; .05) = 3.01

The usual Scheffe contrast ($\psi$) is formed:

$\psi = \sum a_i \bar{X}_i$, where $\sum a_i = 0$.    (1)

The critical difference (CD) is calculated as

$CD = (S)(SE_\psi)$,    (2)

where

$S = [(\text{number of treatment levels} - 1)\, F(\text{critical}, \alpha)]^{1/2}$.    (3)

For a simple pairwise contrast between two means:

$SE_\psi = [(2)(MS_{error}) / N_{group}]^{1/2}$    (4)

The contrast is statistically significant if

$|\psi| > CD$    (5)

The above procedure is applied to the data in this two-way ANOVA for the main effects of day of planting and fertilizer type.

Day of Planting Effect:

Average yield for early planting ($\bar{X}_{early}$) = 29.81 bushels/acre
Average yield for late planting ($\bar{X}_{late}$) = 31.88 bushels/acre

The contrast ($\psi$) is:

$\psi = \bar{X}_{late} - \bar{X}_{early} = 31.88 - 29.81 = 2.07$ bushels/acre

$S = [(\text{treatment levels} - 1)\, F(\text{critical})]^{1/2} = [(1)(4.26)]^{1/2} = 2.06$

$SE_\psi = [(2)(MS_{error}) / N_{group}]^{1/2} = [(2)(3.26)/16]^{1/2} = .64$

$CD = (S)(SE_\psi) = 1.32$

Since {$|\psi|$ = 2.07} > {CD = 1.32}, the contrast is significant at the 5% level of significance. Of course, the ANOVA table results already indicated this effect. The Scheffe calculation was presented to illustrate/review the procedure.
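
The day-of-planting calculation can be reproduced with a short Python sketch. The function name `scheffe_cd` and its argument names are assumptions chosen for this illustration; the numerical inputs are those given in Table 1 and in the text.

```python
# Minimal sketch of the Scheffe critical difference for a pairwise contrast.
import math

def scheffe_cd(levels, f_critical, ms_error, n_group):
    """Return (S, SE, CD) for a pairwise contrast between two means."""
    s = math.sqrt((levels - 1) * f_critical)
    se = math.sqrt(2 * ms_error / n_group)
    return s, se, s * se

# Day of planting: 2 levels, F(1, 24; .05) = 4.26, MS error = 3.26, 16 per group.
s, se, cd = scheffe_cd(levels=2, f_critical=4.26, ms_error=3.26, n_group=16)
psi = 31.88 - 29.81  # late minus early average yield

print(f"S = {s:.2f}, SE = {se:.2f}, CD = {cd:.2f}")
print("significant:", abs(psi) > cd)
```

Changing `levels`, `f_critical`, and `n_group` gives the corresponding critical difference for the fertilizer main effect.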
Fertilizer Effect

Average yield for chlorine (Cl) = 32.125 bushels/acre

Average yield for aero (Aero) = 30.875 bushels/acre

Average yield for sodium (Na) = 30.000 bushels/acre

Average yield for potassium (K) = 30.375 bushels/acre

There are (4C2) = 6 possible pairwise contrasts. These are given in the table below:

Table 2

Pairwise differences $|\psi|$ for four Fertilizers
(bushels / acre)

$CD = (S)(SE_\psi) = 2.705$

Note that no value of $|\psi|$ in Table 2 above is greater than the CD. The ANOVA table furnished the same information as the above calculation. Now let us examine the significant interaction effect as shown in Table 1.
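
Before turning to the interaction, the six pairwise fertilizer differences can be regenerated from the four fertilizer means quoted above. This is a sketch under stated assumptions: the variable names are illustrative, and the critical difference is recomputed without intermediate rounding, so it comes out near 2.71 rather than exactly the 2.705 reported in the text.

```python
# Minimal sketch: pairwise differences among the four fertilizer means.
import math
from itertools import combinations

fertilizer_means = {"Cl": 32.125, "Aero": 30.875, "Na": 30.000, "K": 30.375}

pairwise = {
    (a, b): abs(fertilizer_means[a] - fertilizer_means[b])
    for a, b in combinations(fertilizer_means, 2)
}

# Scheffe critical difference: 4 levels, F(3, 24; .05) = 3.01,
# MS error = 3.26, 8 observations per fertilizer mean.
s = math.sqrt((4 - 1) * 3.01)
se = math.sqrt(2 * 3.26 / 8)
cd = s * se

for pair, diff in pairwise.items():
    print(pair, f"{diff:.3f}", "sig" if diff > cd else "NS")
```

The largest difference (Cl versus Na) is 2.125 bushels/acre, which stays below the critical difference, in agreement with the non-significant fertilizer F ratio.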


Interaction  Effect 

This section will develop a way to examine the interaction effects. Consider the diagram below (Figure 1). In this example, there are two factors A and B, each factor at two levels. The parallel lines indicate there is no interaction. The total interaction can be decomposed into two separate graphs for each level of factor B. The decomposition just makes it visually easier to calculate the slopes for each of the two lines.

[Figure 1. Interaction Decomposition (no interaction): response versus the two levels of factor A, decomposed into one graph for each level of factor B, with the four points labeled A, B, C and D.]

Paying attention to the right side of the arrow in the above figure, the slopes for B1 and B2 respectively are:

$\text{Slope} = \Delta Y / \Delta X$

or alternatively

Slope = A - C = D - B

This expression can be written as

(A + B) - (C + D) = 0

This identity forms the basis for writing the contrast ($\psi$). If no interaction exists, then the contrast can be written as:

$\psi = (A + B) - (C + D) = 0$    (6)

This contrast will be used to examine the significant interaction term in Table 1. The arithmetic means for the day of planting and fertilizer interactions are given below in Table 2.

Table 2

Interaction Data

Early Cl   = 33         Late Cl   = 31.25
Early Aero = 29.5       Late Aero = 32.25
Early Na   = 27.5       Late Na   = 32.50
Early K    = 29.25      Late K    = 31.50

The above interaction effect can be plotted; the plot is shown as Figure 2.

Examination of Figure 2 suggests that the significant interaction shown in Table 1 is driven by the effect of the chlorine fertilizer as it interacts with the other three fertilizers. Figure 3 below gives the interaction decomposition (using the notation of Figure 1) of the Na, Cl component of the total interaction.


[Figure 3. The Na, Cl Component: yield versus date of planting (Early, Late), with the four cell means labeled A, B, C and D.]

Using equation (6), the interaction component contrast can be calculated:

$\psi = (\bar{X}_A + \bar{X}_B) - (\bar{X}_C + \bar{X}_D)$

$\psi = (31.25 + 27.5) - (33 + 32.5)$

$|\psi| = 6.75$

In order to determine if this component of the total interaction is statistically significant (is $|\psi|$ significantly greater than zero?), the Scheffe critical difference (CD) will be calculated using the methods reflected in equations (2) through (5).


$SE_\psi^2 = MSE \left( a_1^2/n_1 + a_2^2/n_2 + a_3^2/n_3 + a_4^2/n_4 \right)$

In this case all $n_i$ are equal.

$SE_\psi^2 = MSE(4/n)$

Therefore

$SE_\psi^2 = (3.26)(4/4)$

$SE_\psi = 1.81$ .

The Other Component:

$S^2 = (df_{interaction})\, F(\text{critical};\ df_{interaction},\ df_{error})$

where $df_{interaction}$ is the df for interaction and F is the critical value.

In this case

$S^2 = (3)\, F_{3,24}(.05) = (3)(3.01) = 9.03$

$S = 3.00$

The Critical Difference

$CD = (SE_\psi)(S) = (1.81)(3) = 5.43$

Since ($|\psi|$ = 6.75) > (CD = 5.43), the day of planting, Na/Cl interaction component of the total interaction is significant.

Figure 4 shows the interaction decomposition of the Na, Aero component of the total interaction.

[Figure 4. The Na, Aero component: yield versus date of planting (Early, Late).]

The lines are not exactly parallel, but are the differences in slopes statistically significant?

The contrast is:

$\psi = (\bar{X}_A + \bar{X}_B) - (\bar{X}_C + \bar{X}_D) = (32.5 + 29.5) - (27.5 + 32.25)$

$\psi = 2.25$

The value for the critical difference (CD) is still 5.43.

Since ($|\psi|$ = 2.25) < (CD = 5.43), the day of planting, Na/Aero component of the total interaction is not significant.

In this example problem there are (4C2) or 6 pairwise components that make up the total interaction. The results of these six components are summarized in Table 3.

Table 3

Component Summary

α = .05      CD = 5.43

Component     |ψ|      Results

Na/Cl         6.75     Sig
Na/Aero       2.25     NS
Na/K          2.75     NS
Cl/Aero       4.50     NS
Cl/K          4.00     NS
Aero/K         .5      NS
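
The six entries of Table 3 can be regenerated from the cell means with a short Python sketch. The helper name `component` and the data layout are assumptions made for this illustration; the contrast is equation (6) applied to each pair of fertilizers, and the inputs are the interaction means, MSE = 3.26 and F(3, 24; .05) = 3.01 from Table 1.

```python
# Minimal sketch: all pairwise interaction components and the Scheffe CD.
import math
from itertools import combinations

# Cell means as (early, late) for each fertilizer.
cell_means = {
    "Cl": (33.0, 31.25),
    "Aero": (29.5, 32.25),
    "Na": (27.5, 32.5),
    "K": (29.25, 31.5),
}

def component(f1, f2):
    """|psi| = |(A + B) - (C + D)| for two fertilizers."""
    early1, late1 = cell_means[f1]
    early2, late2 = cell_means[f2]
    return abs((late1 + early2) - (early1 + late2))

n_per_cell = 4
se = math.sqrt(3.26 * 4 / n_per_cell)   # sum of squared coefficients is 4
s = math.sqrt(3 * 3.01)                 # df for interaction = 3
cd = s * se

components = {pair: component(*pair) for pair in combinations(cell_means, 2)}
significant = [pair for pair, psi in components.items() if psi > cd]

for pair, psi in components.items():
    print(pair, psi, "Sig" if psi > cd else "NS")
print(f"CD = {cd:.2f}")
```

Only the Cl and Na pair exceeds the critical difference, matching the single significant row of Table 3.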

It should be noted that although only one of the components of the total interaction was found to be statistically significant (α = .05), the chlorine fertilizer was involved with the largest values of $|\psi|$.

The results of the Bonferroni method will now be compared to the results obtained from the Scheffe method used so far in this paper. The values for the contrast ($\psi$) and the SE are calculated the same way as for the Scheffe methods. The $|\psi|$ is significant if:

$|\psi| > (t_{\alpha/2p,\,\nu})(SE_\psi)$    (7)

where:  p is the number of components or contrasts examined in the total interaction,

ν is the degrees of freedom for the error term, and

$t_{\alpha/2p,\,\nu}$ is obtained from tables of the critical values for the Bonferroni t (Milliken and Johnson).

The Bonferroni critical difference (BCD) is calculated as:

$BCD = (t_{\alpha/2p,\,\nu})(SE_\psi)$ .    (7A)

In this example:

α = .05
p = 6 possible contrasts
ν = 24

therefore $t_{.05/2p,\,\nu} = 2.88$

and $SE_\psi = 1.81$,

therefore the BCD = 5.21.

Referring to Table 3, it is apparent that this 4% decrease in critical difference (5.43 versus 5.21) does not change the decision regarding the significant component of the total interaction for this particular example.
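
The comparison of the two critical differences amounts to a few lines of arithmetic, sketched below. The Bonferroni t value 2.88 is taken from the text as a tabled value rather than computed, and the variable names are assumptions for illustration.

```python
# Minimal sketch: Bonferroni critical difference versus the Scheffe CD.
t_bonferroni = 2.88   # tabled t for alpha/(2p) with p = 6, 24 error df (from the text)
se_psi = 1.81         # standard error of the interaction contrast
scheffe_cd = 5.43

bcd = t_bonferroni * se_psi
percent_decrease = 100 * (scheffe_cd - bcd) / scheffe_cd

# |psi| values from Table 3.
components = {"Na/Cl": 6.75, "Na/Aero": 2.25, "Na/K": 2.75,
              "Cl/Aero": 4.50, "Cl/K": 4.00, "Aero/K": 0.5}
sig_scheffe = [c for c, psi in components.items() if psi > scheffe_cd]
sig_bonferroni = [c for c, psi in components.items() if psi > bcd]

print(f"BCD = {bcd:.2f}, decrease = {percent_decrease:.0f}%")
print(sig_scheffe, sig_bonferroni)
```

Both criteria flag the same single component, which is why the smaller Bonferroni value does not alter the conclusions here.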

Section 2 (A Three-Way ANOVA Problem).

The following example will expand the discussion of section 1 to a three-way ANOVA. The dependent variable is the time (seconds) required by a blind rat to run a maze. The independent variables are:

1)  When the rat was blinded (early or late in life).

2)  Intelligence (bright, mixed, dull).

3)  Movement (free (F) or restrained (R))

The data and the resulting ANOVA tables are shown below:

Table 4

Three Way ANOVA

                 Early Blinded                              Late Blinded
        Bright      Mixed       Dull             Bright       Mixed        Dull
        F    R      F    R      F     R          F    R       F     R      F      R

        27   55     130  140    55    132        90   105     61    65     140    142
        45   81     120  150    76    96         120  110     82    80     99     96

Mean    36   68     125  145    65.5  114        105  107.5   71.5  72.5   119.5  119

Source                    df     MS         F

Time of Blindness (B)      1      287.04      .83   NS
Intelligence (I)           2     1652.05     4.76   (Sig 5%)
Environment (E)            1     1785.38     5.15   (Sig 5%)
B x I                      2     7638.79    22.02   (Sig 5%)*
B x E                      1     1584.37     4.57   NS
I x E                      2       91.12      .26   NS
B x I x E                  2      115.88      .33   NS
Error                     12      346.88
Total                     23

* F(2, 12; .05) = 3.89

At this point only the significant interaction (time of blindness, intelligence) will be examined in detail. The data for this particular interaction are given in Table 5.

Table 5

Interaction Data
(Arithmetic Means)
(Dependent variable: time in seconds)

Early Blind; Bright = 52 sec          Late Blind; Bright = 106.25 sec
Early Blind; Mixed  = 135 sec         Late Blind; Mixed  = 72 sec
Early Blind; Dull   = 89.75 sec       Late Blind; Dull   = 119.25 sec

The plot of the interaction is given as Figure 5.

[Figure 5. The Total Interaction: average response time (sec) versus intelligence, with one line for the early blinded and one for the late blinded rats.]

The total interaction will be decomposed in the same manner as shown in section 1. There are (3C2) or 3 components to examine. The components of {Bright to Mixed Intelligence} and of {Mixed to Dull Intelligence} appear to be significant. The third component {Bright to Dull} is probably not significant.

Figure 6 shows the bright to mixed intelligence component.

[Figure 6. Intelligence; Bright to Mixed. Points are labeled according to the previous convention.]

By labeling the points according to the previous convention and using equation (6), the interaction component contrast can be calculated:

$\psi = (\bar{X}_A + \bar{X}_B) - (\bar{X}_C + \bar{X}_D) = (135 + 106.25) - (52 + 72)$

$\psi = 117.25$

In this case

$SE_\psi^2 = MSE \sum a_i^2 / n_i$

or $SE_\psi^2 = MSE(4/n)$, which is the same as in section 1. The number of observations for each cell (n) is 4, therefore:

$SE_\psi^2 = 346.88(4/4) = 346.88$

$SE_\psi = 18.62$

The other component ($S^2$) is:

$S^2 = (df_{interaction})\, F = (2)(3.89) = 7.78$

$S = 2.79$

The (CD) is:

$CD = (S)(SE_\psi) = 51.95$

Since {$|\psi|$ = 117.25} > {CD = 51.95}, this component of the interaction [intelligence; bright to mixed] is statistically significant.

Figure 7 gives the mixed to dull intelligence component.

[Figure 7. Intelligence; mixed to dull.]

The contrast for this component is:

$\psi = (\bar{X}_A + \bar{X}_B) - (\bar{X}_C + \bar{X}_D) = (89.75 + 72) - (135 + 119.25)$

$\psi = -92.5$

The CD is still equal to 51.95.

Since {$|\psi|$ = 92.5} > {CD = 51.95}, the mixed to dull component of the interaction is statistically significant. This was expected.

Finally, Figure 8 shows the bright to dull component.

[Figure 8. Intelligence; Bright to Dull.]

Note that {$|\psi|$ = 24.75} < {CD = 51.95}, therefore this component of the interaction is not statistically significant.

In this problem, 2 out of 3 total pairwise interaction components were significant at a 5% level of significance.

The Bonferroni method will be applied to this problem. SE is the same value (18.62), p is equal to 3 and ν is 12. Therefore the Bonferroni t (α = .05) is equal to 2.78. The Bonferroni critical difference (BCD) is (18.62)(2.78) or 51.76.

This is only slightly less than the Scheffe CD of 51.95. There are no differences in the decision regarding significant components between the two methods for this particular example.
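
The three components of the blindness by intelligence interaction, together with both critical differences, can be checked with the following Python sketch. The helper name `component` and the dictionary layout are assumptions for illustration; the means, MSE = 346.88, F = 3.89 and the Bonferroni t of 2.78 are the values given in this section.

```python
# Minimal sketch: interaction components for the three-way example.
import math

means = {
    ("Early", "Bright"): 52.0, ("Early", "Mixed"): 135.0, ("Early", "Dull"): 89.75,
    ("Late", "Bright"): 106.25, ("Late", "Mixed"): 72.0, ("Late", "Dull"): 119.25,
}

def component(level_1, level_2):
    """|psi|: difference between the early and late changes from level_1 to level_2."""
    early_change = means[("Early", level_2)] - means[("Early", level_1)]
    late_change = means[("Late", level_2)] - means[("Late", level_1)]
    return abs(early_change - late_change)

components = {
    "Bright to Mixed": component("Bright", "Mixed"),
    "Mixed to Dull": component("Mixed", "Dull"),
    "Bright to Dull": component("Bright", "Dull"),
}

se = math.sqrt(346.88 * 4 / 4)      # MSE * (sum of squared coefficients) / n
cd = math.sqrt(2 * 3.89) * se       # Scheffe: df for interaction = 2
bcd = 2.78 * round(se, 2)           # Bonferroni, using the rounded SE as in the text

sig_scheffe = [c for c, psi in components.items() if psi > cd]
sig_bonferroni = [c for c, psi in components.items() if psi > bcd]
print(components, round(cd, 2), round(bcd, 2))
```

Writing the contrast as a difference of changes is algebraically the same as the (A + B) - (C + D) form of equation (6).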


Section 3 (Interaction Where Both Factors (F, G) Have More Than Two Levels)

Consider the case where factor F has three equally spaced levels {f1, f2, f3} and factor G has levels {g1, g2, g3}. Consider Figure 9 below, which shows the decomposition of the total interaction in the interval {f1, f2} for factor F.

[Figure 9. Two Factors at Three Levels.]

Given that no interaction exists, is the "chaining notation" discussed in sections 1 and 2 of this paper true in this situation? Following the notation/convention discussed earlier:

$\psi = (\bar{X}_b + \bar{X}_c) - (\bar{X}_d + \bar{X}_a) + (\bar{X}_d + \bar{X}_e) - (\bar{X}_f + \bar{X}_c) = 0$    (8)

The demonstration is simple. Given that no interaction exists, then $(\bar{X}_b - \bar{X}_a) = (\bar{X}_d - \bar{X}_c) = (\bar{X}_f - \bar{X}_e)$.

Equation (8) can be written without brackets:

$\psi = \bar{X}_b + \bar{X}_c - \bar{X}_d - \bar{X}_a + \bar{X}_d + \bar{X}_e - \bar{X}_f - \bar{X}_c$

Re-arranging terms:

$\psi = \bar{X}_b - \bar{X}_a - \bar{X}_d + \bar{X}_c + \bar{X}_d - \bar{X}_c - \bar{X}_f + \bar{X}_e$

Inserting brackets

$\psi = (\bar{X}_b - \bar{X}_a) - (\bar{X}_d - \bar{X}_c) + (\bar{X}_d - \bar{X}_c) - (\bar{X}_f - \bar{X}_e)$

Since all terms in brackets are equal to each other, it follows that:

$\psi = 0$ if no interaction is present.

Note that $\sum a_i = 0$ for the contrast.

In this case:

$SE_\psi^2 = MSE \sum a_i^2 / n_i$ .

If the $n_i$ are equal to n, then

$SE_\psi^2 = (MSE / n) \sum a_i^2$ .

In the case of the Scheffe methodology:

$S^2 = (df_{interaction})\, F = (2 \times 2)\, F = 4F$

or,

$S = 2\sqrt{F}$

These values for ($SE_\psi$, S) would be the same for examining interaction components in the intervals {f2, f3} and {f1, f3}.
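
The chaining identity of equation (8) can be checked numerically. In the sketch below, the assignment of the labels a through f to cells (a, c, e at level f1 and b, d, f at level f2, for g1, g2, g3 respectively) is an assumption consistent with the equalities stated above, and the cell means are hypothetical values built from an additive model.

```python
# Minimal sketch: the chained contrast of equation (8) on additive cell means.

def chained_contrast(x):
    """psi = (Xb + Xc) - (Xd + Xa) + (Xd + Xe) - (Xf + Xc)."""
    return (x["b"] + x["c"]) - (x["d"] + x["a"]) + (x["d"] + x["e"]) - (x["f"] + x["c"])

# HYPOTHETICAL additive (no interaction) means: G effect plus F effect.
g_effect = {"g1": 10.0, "g2": 14.0, "g3": 19.0}
f_effect = {"f1": 0.0, "f2": 3.0}
additive = {
    "a": g_effect["g1"] + f_effect["f1"], "b": g_effect["g1"] + f_effect["f2"],
    "c": g_effect["g2"] + f_effect["f1"], "d": g_effect["g2"] + f_effect["f2"],
    "e": g_effect["g3"] + f_effect["f1"], "f": g_effect["g3"] + f_effect["f2"],
}
psi_additive = chained_contrast(additive)

# Perturb one cell to introduce an interaction; the contrast moves away from zero.
with_interaction = dict(additive, f=additive["f"] + 3.0)
psi_interaction = chained_contrast(with_interaction)

print(psi_additive, psi_interaction)
```

Because the c and d terms cancel, the chained contrast reduces to the difference between the first and last slopes in the chain.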

Summary 

This notation (chaining) can be applied for two-factor interactions where the factors are at any number of levels. Each linear component of the total interaction can be examined to determine which component(s) contributed to the overall significant interaction. As the examples presented in this paper show, a statistically significant interaction (per the omnibus F test) does not imply that all components of the total interaction are statistically significant. The post-hoc analysis of the interaction should lead to improved insights about the data just as these methods aid in the analysis of the main effects. Both Scheffe and Bonferroni methods were applied to the example data. The two methods led to the same decisions concerning which components of the interaction were significant, and the differences between the "critical differences" were small. It should be noted that these comparisons are based on just two examples using a two-way ANOVA. The differences may become more apparent for more complex designs.
ABSTRACT

The ICC is a personal camouflage net for soldiers which will be useful for patrols, snipers, and ambush situations. This study determined whether the ICC should have large or small Hogan incisions, and what color(s) best blended with the desert backgrounds. Ten U.S. Marines and two civilians subjectively evaluated seventy-four ICCs (thirty-seven different colors, half with large and half with small Hogan incisions) at five desert sites. The ICCs were ranked in groups of six, selecting four at a time, to reduce the number to the final six colors with associated incisions. The final six were subjected to paired comparison rankings, which overcome the problem of inconsistency of judgements given by the same observer. The data were analyzed statistically to determine preferred color with associated incision, establish confidence limits, and color grouping for each site and across all sites.


1.0  SECTION  1  -  INTRODUCTION 

The Countersurveillance and Deception Division was tasked by FORSCOM in early 1986 to develop the individual camouflage cover (ICC) for desert, woodland, and snow environments. The ICC is a small cloth cover, 5' x 7', which will weigh about 10-14 ounces, and be able to fit into a battle dress uniform pocket when not being used. It will deny the detection of a prone soldier in an ambush situation, or when on a surveillance, long-range patrol situation. The purpose of this study was twofold. The first task was to determine if a small or large Hogan garnish incision was best. The second task was to determine the best desert color to accompany the incision. Five sites were selected in the desert southwest, and the ICCs were evaluated by ground observers as to how well they blended with the desert backgrounds.

2.0  SECTION  2  -  PROCEDURE 

2.1  Test ICCs.

There were a total of thirty-seven variations of desert colors for this study. The nucleus of these colors was taken from the Saudi Arabian net palette study. These original colors were tested in the deserts of Saudi Arabia 2/ and the U.S. desert southwest. Additional colors were obtained through modification. Each of the thirty-seven colors was painted on seventy-four vinyl-coated sheets, 5' x 7', which were then incised with either the small or large Hogan incision. Thus, there was a total of seventy-four vinyl-coated ICCs - thirty-seven small Hogans and thirty-seven large Hogans.
2.2  Test Sites.

Five sites were used to evaluate the ICCs. Two of the sites were in the Yuma, Arizona area, two at Anza Borrego State Park, California, and one at Jean Lake, near Las Vegas, Nevada. Both sites at Anza Borrego State Park were sandy with small stones. Vegetation was very sparse. Yuma site #1 was very sandy with some vegetation, while Yuma site #2 was on Ogilby Road and was rocky with very sparse vegetation. The Jean Lake site contained moderate vegetation with rocks, and was located on a hillside.

2.3  Test  Subjects. 

The test subjects consisted of ten enlisted U.S. Marine Corps personnel from Camp Pendleton, California, and two civilians from the Belvoir Research, Development, and Engineering Center, Fort Belvoir, Virginia. All personnel had corrected 20/20 vision and normal color vision. No observations were made with sunglasses.

2.4  Data  Generation. 

The seventy-four Hogan incised ICCs were randomly assigned to groups of six each. The four that best blended with the desert environment, in terms of color and texture, were selected and put aside for additional evaluations. This process continued until the original seventy-four ICCs were reduced to the six best. The best six ICCs were then shown in all possible pairs (fifteen), with the best ICC for each pair chosen for ability to blend with the desert. The number of times each individual ICC was judged to be the best was tabulated and subjected to data analysis.


3.0  SECTION  3  -  RESULTS 

The  ICCs  were  evaluated  at  each  of  the  five  sites  to  determine  which  colors 
best  blended  with  the  desert  environment.  Section  2.4  describes  how  the  best  six 
ICCs  were  selected  for  each  site.  Table  1  shows  the  top  six  colors  for  each  of 
the  five  sites. 
TABLE 1

Summary of the Best Six Desert ICCs for Each Site

Sites: Yuma Site 1, Yuma Site 2, Jean Lake, Anza Borrego Site 1, Anza Borrego Site 2

Color      Number of sites (of five) at which
           the ICC was among the best six

P6-S       1
W-S        3
XI-S       3
XI-L       3
12-S       1
21-S       2
21-L       3
26-S       4
26-L       2
33-S       5
33-L       2
37-S       1

NOTE:  The L is large Hogan incision, while S is small Hogan incision. Net 33-S is the only color to make the best six colors for all five sites.
The results for each site for the above six best nets are not included,
because they would be too voluminous to present in these proceedings.
These data are available upon request from the U.S. Army Belvoir
Research, Development and Engineering Center, ATTN: STRBE-JD8, Fort Belvoir,
VA 22060. When averaging the final best six ICCs across all five sites, a
total of twelve ICCs made the best list. Some nets, such as 37-S, made the
final six ICCs for only one site. A value of zero was entered in each cell
when the ICC did not make the final six for that particular site.

Tables 2-4 contain the statistics for the twelve ICCs. Figure 1 is the
graphic display of Table 2. Table 5 describes the final twelve ICC nets as to
color and incision.

TABLE 2

Descriptive Data for Final ICCs (Color Blend)
with Desert Background, Across All Sites

                            STANDARD    95% CONFIDENCE INTERVAL
COLOR     N     MEAN        ERROR       LOWER LIM     UPPER LIM
P6-S      59    0.1864      0.6010      0.0298        0.3431
W-S       59    1.4237      1.6422      0.9957        1.8517
XI-S      59    1.5932      1.5550      1.1879        1.9985
XI-L      59    1.6780      1.8795      1.1881        2.1678
12-S      59    0.1017      0.6616      0.0000        0.2741
21-S      59    0.9153      1.3808      0.5554        1.2751
21-L      59    0.9831      1.2931      0.6460        1.3201
26-S      59    2.8983      1.8541      2.4151        3.3816
26-L      59    1.2712      1.7304      0.8202        1.7222
33-S      59    2.7119      1.4026      2.3463        3.0774
33-L      59    0.6610      1.1539      0.3603        0.9618
37-S      59    0.5763      1.2206      0.2581        0.8944


Note  that  the  higher  the  mean  value,  the  better  the  ICC  blended  with  the 
desert  environments. 
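
As a hedged check on how the interval limits in Table 2 relate to the other columns, the sketch below recomputes them as mean plus or minus t times s over the square root of N. Two things here are assumptions inferred from the printed numbers rather than stated in the table: the column headed STANDARD ERROR is treated as a per-observation spread s, and the lower limit is clipped at zero. The critical value 2.0017 (two-sided 95 percent Student t, 58 degrees of freedom) is also supplied by hand.

```python
import math

T_CRIT_58 = 2.0017  # assumed two-sided 95% Student-t critical value, 58 d.f.


def confidence_limits(mean, spread, n, t_crit=T_CRIT_58):
    """Return (lower, upper) limits for a mean, clipping the lower limit at 0."""
    half_width = t_crit * spread / math.sqrt(n)
    lower = max(0.0, mean - half_width)  # assumption: scores cannot be negative
    return lower, mean + half_width


p6s_lower, p6s_upper = confidence_limits(0.1864, 0.6010, 59)   # P6-S row
s12_lower, s12_upper = confidence_limits(0.1017, 0.6616, 59)   # 12-S row
print(round(p6s_lower, 4), round(p6s_upper, 4))
print(round(s12_lower, 4), round(s12_upper, 4))
```

Under these assumptions the recomputed limits agree with the printed P6-S and 12-S rows to about three decimal places.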

TABLE 3

Analysis of Variance for Final ICCs (Color Blend)
with Desert Background, Across All Sites

SOURCE    DF     SUM OF SQUARES    MEAN SQUARE    F-TEST     SIG LEVEL
Color     11     508.5466          46.2315        22.8823    0.0000*
Error     696    1406.2034         2.0204
Total     707    1914.7500

* Significant at α less than .001 level.

This  table  indicates  that  there  are  significant  differences  in  the 
ability  of  the  final  ICCs  to  blend  with  the  desert  backgrounds.  Table  4 
identifies  which  ICCs  are  significantly  different  from  each  other. 
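
The arithmetic behind Table 3 can be retraced with a short sketch. It assumes the usual one-way analysis of variance relations (mean square = sum of squares / degrees of freedom, F = mean square for color / mean square for error); the function name is illustrative only.

```python
def one_way_anova_f(ss_between, df_between, ss_error, df_error):
    """Return (between mean square, error mean square, F ratio)."""
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return ms_between, ms_error, ms_between / ms_error


# Sums of squares and degrees of freedom as printed in Table 3.
ms_color, ms_error, f_ratio = one_way_anova_f(508.5466, 11, 1406.2034, 696)
print(round(ms_color, 4), round(ms_error, 4), round(f_ratio, 4))
```

The degrees of freedom follow from the design: twelve colors give 11 for color, and 12 x 59 = 708 observations give 707 in total, leaving 696 for error.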


Figure 1. Ability of the Final ICCs to Blend with
Desert Background, Averaged Across All Sites. [Horizontal axis: COLORS]

TABLE 4

Individual Comparisons, Identifying Which of the Final
ICC Colors Differed Significantly from Each Other,
Averaged Across Sites

COLOR PAIR           COMPARISON    SUM OF SQUARES    F          SIGNIFICANCE LEVEL
P6-S and W-S         -1.23729      45.16102          22.352     0.00000 ***
P6-S and XI-S        -1.40678      58.38136          28.896     0.00000 ***
P6-S and XI-L        -1.49153      65.62712          32.482     0.00000 ***
P6-S and 12-S         0.08475      0.21186           0.105      1.00000
TABLE 4 (Cont)

COLOR PAIR           COMPARISON    SUM OF SQUARES    F          SIGNIFICANCE LEVEL
P6-S and 21-S        -0.72881      15.66949          7.756      0.00543 **
P6-S and 21-L        -0.79661      18.72034          9.266      0.00238 **
P6-S and 26-S        -2.71186      216.94915         107.379    0.00000 ***
P6-S and 26-L        -1.08475      34.71186          17.181     0.00004 ***
P6-S and 33-S        -2.52542      188.14407         93.122     0.00000 ***
P6-S and 33-L        -0.47458      6.64407           3.288      0.06998
P6-S and 37-S        -0.38983      4.48305           2.219      0.13656
W-S and XI-S         -0.16949      0.84746           0.419      0.51732
W-S and XI-L         -0.25424      1.90678           0.944      0.33148
W-S and 12-S          1.32203      51.55932          25.519     0.00000 ***
W-S and 21-S          0.50847      7.62712           3.775      0.05222
W-S and 21-L          0.44068      5.72881           2.835      0.09243
W-S and 26-S         -1.47458      64.14407          31.748     0.00000 ***

TABLE 4 (Cont)

COLOR PAIR           COMPARISON    SUM OF SQUARES    F          SIGNIFICANCE LEVEL
W-S and 26-L          0.15254      0.68644           0.340      0.56006
W-S and 33-S         -1.28814      48.94915          24.227     0.00000 ***
W-S and 33-L          0.76271      17.16102          8.494      0.00362 **
W-S and 37-S          0.84746      21.18644          10.486     0.00123 **
XI-S and XI-L        -0.08475      0.21186           0.105      1.00000
XI-S and 12-S         1.49153      65.62712          32.482     0.00000 ***
XI-S and 21-S         0.67797      13.55932          6.711      0.00968 **
XI-S and 21-L         0.61017      10.98305          5.436      0.01987 *
XI-S and 26-S        -1.30508      50.24576          24.869     0.00000 ***
XI-S and 26-L         0.32203      3.05932           1.514      0.21870
XI-S and 33-S        -1.11864      36.91525          18.271     0.00002 ***
XI-S and 33-L         0.93220      25.63559          12.688     0.00038 ***
XI-S and 37-S         1.01695      30.50847          15.100     0.00011 ***

TABLE 4 (Cont)

COLOR PAIR           COMPARISON    SUM OF SQUARES    F          SIGNIFICANCE LEVEL
XI-L and 12-S         1.57627      73.29661          36.278     0.00000 ***
XI-L and 21-S         0.76271      17.16102          8.494      0.00362 **
XI-L and 21-L         0.69492      14.24576          7.051      0.00801 **
XI-L and 26-S        -1.22034      43.93220          21.744     0.00000 ***
XI-L and 26-L         0.40678      4.88136           2.416      0.12032
XI-L and 33-S        -1.03390      31.53390          15.608     0.00008 ***
XI-L and 33-L         1.01695      30.50847          15.100     0.00011 ***
XI-L and 37-S         1.10169      35.80508          17.722     0.00003 ***
12-S and 21-S        -0.81356      19.52542          9.664      0.00192 **
12-S and 21-L        -0.88136      22.91525          11.342     0.00078 ***
12-S and 26-S        -2.79661      230.72034         114.195    0.00000 ***
12-S and 26-L        -1.16949      40.34746          19.970     0.00001 ***
12-S and 33-S        -2.61017      200.98305         99.477     0.00000 ***

TABLE 4 (Cont)

COLOR PAIR           COMPARISON    SUM OF SQUARES    F          SIGNIFICANCE LEVEL
12-S and 33-L        -0.55932      9.22881           4.568      0.03275 *
12-S and 37-S        -0.47458      6.64407           3.288      0.06998
21-S and 21-L        -0.06780      0.13559           0.067      1.00000
21-S and 26-S        -1.98305      116.00847         57.418     0.00000 ***
21-S and 26-L        -0.35593      3.73729           1.850      0.17403
21-S and 33-S        -1.79661      95.22034          47.129     0.00000 ***
21-S and 33-L         0.25424      1.90678           0.944      0.33148
21-S and 37-S         0.33898      3.38983           1.678      0.19543
21-L and 26-S        -1.91525      108.21186         53.559     0.00000 ***
21-L and 26-L        -0.28814      2.44915           1.212      0.27108
21-L and 33-S        -1.72881      88.16949          43.639     0.00000 ***
21-L and 33-L         0.32203      3.05932           1.514      0.21870
21-L and 37-S         0.40678      4.88136           2.416      0.12032
TABLE 4 (Cont)

COLOR PAIR           COMPARISON    SUM OF SQUARES    F          SIGNIFICANCE LEVEL
26-S and 26-L         1.62712      78.10169          38.656     0.00000 ***
26-S and 33-S         0.18644      1.02542           0.508      0.47633
26-S and 33-L         2.23729      147.66102         73.085     0.00000 ***
26-S and 37-S         2.32203      159.05932         78.726     0.00000 ***
26-L and 33-S        -1.44068      61.22881          30.305     0.00000 ***
26-L and 33-L         0.61017      10.98305          5.436      0.01987 *
26-L and 37-S         0.69492      14.24576          7.051      0.00801 **
33-S and 33-L         2.05085      124.07627         61.412     0.00000 ***
33-S and 37-S         2.13559      134.54237         66.592     0.00000 ***
33-L and 37-S         0.08475      0.21186           0.105      1.00000

The following ICCs differed significantly from each other: P6-S vs. W-S,
P6-S vs. XI-S, P6-S vs. XI-L, P6-S vs. 21-S, P6-S vs. 21-L, P6-S vs. 26-S,
P6-S vs. 26-L, P6-S vs. 33-S, W-S vs. 12-S, W-S vs. 26-S, W-S vs. 33-S, W-S
vs. 33-L, W-S vs. 37-S, XI-S vs. 12-S, XI-S vs. 21-S, XI-S vs. 21-L, XI-S vs.
26-S, XI-S vs. 33-S, XI-S vs. 33-L, XI-S vs. 37-S, XI-L vs. 12-S, XI-L vs.
21-S, XI-L vs. 21-L, XI-L vs. 26-S, XI-L vs. 33-S, XI-L vs. 33-L, XI-L vs.
37-S, 12-S vs. 21-S, 12-S vs. 21-L, 12-S vs. 26-S, 12-S vs. 26-L, 12-S vs.
33-S, 12-S vs. 33-L, 21-S vs. 26-S, 21-S vs. 33-S, 21-L vs. 26-S, 21-L vs.
33-S, 26-S vs. 26-L, 26-S vs. 33-L, 26-S vs. 37-S, 26-L vs. 33-S, 26-L vs.
33-L, 26-L vs. 37-S, 33-S vs. 33-L, and 33-S vs. 37-S.

*  Significant at α less than .05 level.

**  Significant at α less than .01 level.

***  Significant at α less than .001 level.

TABLE 5

Physical Description of the Final Twelve ICCs.

COLOR/INCISION    DESCRIPTION

P6-S              Black spots on tan color 26, color XI on reverse side.
W-S               A fifty-fifty mixture of Saudi Arabian color 8 and 7 on
                  both sides of the net.
XI-S              Standard tan color on both sides of the net.
XI-L              Same color as XI-S, only this ICC has large incisions.
12-S              New color on both sides of net.
21-S              Color XI on one side of the net, new color 33 on the
                  other side.
21-L              Same color as 21, only this ICC has large incisions.
26-S              New color on both sides of net.
26-L              Same color as 26, only this ICC has large incisions.
33-S              New color on both sides of net.
33-L              Same color as 33, only this ICC has large incisions.
37-S              Color XI on one side of the net, with color W on the
                  other side.

Note that S is small Hogan incisions, while L is large Hogan incisions.

4.0  SECTION  4  -  DISCUSSION

All the colors were on the gray or tan scale, with the tan colors rated
as having the most ability to blend with the desert background. Table 1 shows
that the pattern ICC net P6-S was the only multi-color to make the final
twelve ICCs, and it, along with net 12-S, was judged by the ground observers as
having the least ability to blend with the desert background when averaged
across all five sites. Net 33-S was the only net to make the final six for
all sites. ICC 26-S was a final net for all sites except Yuma site #2.
These nets did not significantly differ from each other (α = 0.476), with net
33-S having a preference rating of 3.07 to 3.38 for net 26-S. The Yuma site
#2 area was very rocky, while the other sites were very sandy. The test team
has seen deserts in Egypt and Saudi Arabia, and these deserts were very sandy.
Therefore, net 26-S appears to be the best ICC for general desert use. This
color was among the best six at Yuma site #2, only it had large Hogan
incisions (26-L). The texture of the rocks is larger and rougher in appearance
than that of sand. It appears that the texture of the rocks was the driving
force in the selection of 26-L rather than 26-S. Four of the top five ICCs,
26-S, 33-S, XI-S and W-S, had small incisions. The only exception is ICC
XI-L. Except for very rocky deserts, the small incision blends best with the
texture of the desert floor. Desert color paint studies (2, 3, 4) have shown that
the desert southwest is a darker, more gray desert than those seen in Saudi
Arabia and Egypt. Additional deserts of interest in the Middle East should be
photographed and soil samples studied before a final decision is made for the
colors 26 and 33.

5.0  SECTION  5  -  SUMMARY  AND  CONCLUSIONS 

A total of thirty-seven colors were painted on seventy-four vinyl-coated
sheets 5' x 7'. Each color was given either the small or large Hogan
incision. These ICCs were then taken to five sites in the desert southwest
and evaluated as to their ability to blend with the desert background in terms
of color and texture. Ten enlisted U.S. Marine Corps personnel from Camp
Pendleton, California, and two civilians from the Belvoir Research,
Development and Engineering Center, Fort Belvoir, Virginia, served as ground
observers. The seventy-four ICCs were randomly assigned to groups of six
each. The four ICCs that best blended with the desert environment were
selected and put aside for additional evaluation, which continued until the
best six for each site remained. These best six ICCs were then viewed in all
possible pairs (15), with the better of each pair selected for its ability to
match the desert floor. The number of times each individual ICC was judged to
be best was tabulated and subjected to data analysis. The following
conclusions were drawn:

a. Colors 26 and 33 were the most effective in blending with the desert.

b. Color 26 was selected for initial ICC production.

c. The small Hogan incision (S) is more effective than the large Hogan
incision (L) except for very rocky terrain.

d. The U.S. desert southwest is darker and more gray than the sites seen
in the Middle East, making additional work on the two colors necessary before
final color selection.
The Combinatorics of Message Filtering


Terence  M.  Cronin 

US  Army  Signals  Warfare  Center,  Warrenton,  Virginia 

Topic:  Computational  Aspects  of  Event  Recognition  Under  Conditions  of  Sparse 
Reporting,  Uncertainty,  and  Information  Decay 


[Background:  The  general  problem  of  filtering  a  stack  of  documents  is  arguably 
context-sensitive,  i.e.,  an  individual  document  cannot  be  prioritized  independently  of 
semantic  knowledge  about  the  current  environment.  In  pursuing  this  line  of  thought, 
an  attempt  is  being  made  to  recognize  background  events  which  change  dynamically 
in  time,  with  the  ultimate  motivation  being  to  assess  the  import  of  any  given  message 
with  respect  to  the  time-criticality  of  the  most  recent  set  of  events. 

[Abstract:  Given  a  set  of  message  traffic  and  an  exhaustive  menu  of  possible  events, 
select  the  event  which  is  best  explained  by  the  message  data.  This  problem  involves  a 
reasoning  process  known  as  abduction,  as  differentiated  from  the  processes  of 
deduction  and  induction.  An  argument  is  made  that  the  recognition  of  events  from 
message  data  is  a  diagnosis  problem.  In  the  medical  world,  disorders  are  diagnosed 
from  observation  of  symptoms.  In  the  case  of  electronic  troubleshooting,  failure  of  a 
whole circuit may be explained by failure of single components or sets of components.
In  the  general  sense,  an  event  may  be  diagnosed  by  careful  observation  of  the 
constituent  phenomena  which  comprise  the  event.  With  respect  to  battlefield 
situation  assessment,  both  the  manifestations  for  events  and  the  events  themselves 
change  dynamically  as  more  message  traffic  enters  the  system,  since  the  decay  of  one 
event  is  accompanied  by  the  emergence  of  another  over  time.  This  paper  develops  a 
formal  theory  of  machine-assisted  event  recognition,  but  also  casts  an  eye  on  the 
feasibility  of  implementation.  Treated  with  some  rigor  are  the  combinatorics 
associated  with  such  new  formalisms  as  suspecting  an  event;  confirming  an  event; 
computing  the  threat  of  an  event;  revoking  a  stale  event;  introducing  two  levels  of 
relaxation  into  statistical  testing;  recovering  from  fundamental  forms  of  string  error, 
and  the  number  of  feasible  ways  to  filter  a  stream  of  n  messages. 


Fundamentals:  Definitions  and  Concepts. 

A message m is a feature vector together with a string of text:
$m = \{x_1, x_2, \ldots, x_n, x_{string}\}$. The feature vector is a set of
sensor-measurable observable attributes of a manmade object. The string
represents natural language which may have been generated by one of two
communicators: either an individual who in some way controls the manmade
object, or an outside observer describing the interaction of the object with
the world.

The timeliness of a message $m_i$ is the time $t_i$ at which its feature vector
was created (time at sensor detection). A message $m_i$ is said to be more timely than
message $m_j$ iff $t_i > t_j$.


A  map  y  is  a  spatially  organized  representation  of  a  section  of  the  world 
upon  which  the  manmade  objects  referenced  in  messages  move  about. 

A constituent phenomenon g is a logical function of message data
conjoined with map data. If the expression $g(x_i, y)$ evaluates to true, then
$g(x_i, y) = 1$; otherwise $g(x_i, y) = 0$.

An event e is a set of constituent phenomena: $e = \{g_1, g_2, \ldots, g_k\}$.


The message set M is the set of all messages: $M = \{m_i \mid i = 1, \ldots, n\}$.


The event space E is the set of all events: $E = \{e_i \mid i = 1, \ldots, m\}$.


Becoming  Suspicious,  Amassing  Support,  and  Confirming  Event  Hypotheses. 

The problem of recognizing an event by aggregating the truth or falsity
of its message-derived constituent phenomena is being treated as a diagnosis problem.
Message-driven event recognition must avail itself of a reasoning process known as
abduction (as contrasted with induction and deduction), in which the event which best
explains the message data is selected as the most likely hypothesis, even when the
message data is incomplete, subject to some error, and describes temporally transient
phenomena. This form of automated reasoning is still very much a research issue, with
several disjoint efforts seemingly offering potential leverage. An abductive
inferencing mechanism is being explored for a medical domain, by assembling those
hypotheses which are best explained by a set of data [J1]. There has been promising
work recently in the areas of justification-based and assumption-based truth
maintenance systems [D1, D2, M1]. These techniques achieve truth maintenance by
detecting inconsistency, followed respectively by dependency-directed backtracking,
or by gathering the most general context which preserves consistency. Yet another
interesting line of research is a minimal covering set theory approach [N1, P1], which
attempts to diagnose medical disorders by constructing the least set of symptoms
which point to each disorder. However, the computational feasibility of this technique
is questionable, since derivation of the minimal covering set belongs to the class of
NP-complete problems [G1].

The foundation of a new theory of event recognition emerges if one
unifies the disciplines of truth maintenance systems with minimal covering sets. If a
dimension is added to accommodate other than temporally static situations, then the
theory permits recognizing events from their manifestations, when both the events
and their manifestations may be changing dynamically in time. A crucial underpinning
of the theory is that the emergence of a new event is inversely proportional to the
decay of an older event, since the same observable primitive resources are involved.
Also assumed as axiomatic is the concept that full credence in an event is well nigh
impossible, due to the non-systematic way in which evidence accrues, together with
the difficulty in retracting an assertion once it is assigned a probability of one [K3].
Therefore, the theory must be capable of confirming events when only partial support
is manifested. It will be seen that this becomes feasible if one is permitted to revoke
support for phenomena which have already emerged and sustained under both spatial
and temporal constraints.


A  message  m  is  said  to  support  an  event  e  if  and  only  if  there  exists  some 
feature  xj  of  m,  some  constituent  phenomenon  gk  of  e,  such  that  gk(xj)  evaluates  to 
true.  If  such  is  the  case,  we  also  say  that  phenomenon  gk  is  supported  by  m.  Any 
unsupported  phenomenon  of  e  is  called  a  virtual  phenomenon. 
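
The definitions so far (message, timeliness, constituent phenomenon, event, support, virtual phenomenon) can be sketched as data structures. This is an illustration under assumptions, not the paper's implementation: the class and function names, the feature keys, and the map dictionary are all hypothetical, and each phenomenon is modeled as a predicate over the whole feature vector plus the map rather than over a single feature.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, float]
# A constituent phenomenon g: a logical function of message data and map data.
Phenomenon = Callable[[Features, dict], bool]


@dataclass
class Message:
    features: Features  # sensor-measurable attributes x_1 .. x_n
    text: str           # natural-language string
    time: float         # time at sensor detection


@dataclass
class Event:
    name: str
    phenomena: Dict[str, Phenomenon]  # label -> constituent phenomenon


def more_timely(m_i: Message, m_j: Message) -> bool:
    """m_i is more timely than m_j iff t_i > t_j."""
    return m_i.time > m_j.time


def supported_phenomena(m: Message, e: Event, world_map: dict) -> List[str]:
    """Labels of the phenomena of e that evaluate to true on message m."""
    return [k for k, g in e.phenomena.items() if g(m.features, world_map)]


def virtual_phenomena(m: Message, e: Event, world_map: dict) -> List[str]:
    """Labels of the phenomena of e left unsupported by message m."""
    return [k for k, g in e.phenomena.items() if not g(m.features, world_map)]


def supports(m: Message, e: Event, world_map: dict) -> bool:
    return bool(supported_phenomena(m, e, world_map))


# Hypothetical event with two phenomena, and one message.
river_crossing = Event("river crossing", {
    "near_river": lambda x, y: x.get("distance_to_river_km", float("inf")) <= y["near_km"],
    "bridging": lambda x, y: x.get("bridging_activity", 0.0) > 0.5,
})
world_map = {"near_km": 1.0}
msg = Message({"distance_to_river_km": 0.4}, "unit sighted on east bank", time=10.0)
later = Message({}, "no change", time=12.0)
print(supported_phenomena(msg, river_crossing, world_map))
```

In this toy case the message supports the event through one phenomenon, and the other phenomenon remains virtual.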


Entropy  is  a  measure  of  information  not  available  to  make  a  decision 
about  an  event's  feasibility.  In  this  context,  entropy  is  synonymous  with  uncertainty. 

A phenomenon entropy function $f_p$ assigns to each constituent
phenomenon $g_k$ of an event $e_i$ some integer value $n_k$ based on the relative utility of
$g_k$: $f_p(g_k) = n_k$. Phenomenon $g_k$ is said to have entropy value $n_k$. Small values are
assigned to constituent phenomena which are of minimal use in the decidability of
event $e_i$.


River Crossing Constituent Phenomena

    There exists some object X which is located near some river V, and
    X is constructing a bridge.

    Although no such object has been reported upon, if X is assumed,
    then some of the other constituent phenomena are true.

    Object X has historically been shown capable of conducting an
    operation of this semantic type.

    All other objects (Z) which are organizationally associated with X
    are within a reasonable spatial radius of X.

    There exist some objects spatially and temporally configured to
    support X.

    The coordinates of object X seem to represent a reasonable part of
    river V for conducting a crossing operation.

Figure 1. Illustration of Constituent Phenomena and Respective Entropy Values for a Hypothetical
Event. Note the Subjunctive Voice of the Upper Right Phenomenon.


The total entropy $T_e$ of an event $e_i$ is the sum of its phenomenon entropy
values: $T_e = \sum_{k=1}^{n} f_p(g_k)$.
The instantiated entropy $I_e$ of an event $e_i$ is the sum of the entropy values
associated with the currently supported phenomena of $e_i$.

The suspicion-ratio for an event $e_i$ is the quotient of the instantiated
entropy of $e_i$ with the total entropy of $e_i$.
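
A minimal sketch of total entropy, instantiated entropy, and the suspicion-ratio follows. The entropy values and phenomenon labels below are hypothetical and chosen only for illustration; the function names are likewise not from the paper.

```python
def total_entropy(entropy_values):
    """Sum of the entropy values of all constituent phenomena of an event."""
    return sum(entropy_values.values())


def instantiated_entropy(entropy_values, supported):
    """Sum of the entropy values of the currently supported phenomena."""
    return sum(v for k, v in entropy_values.items() if k in supported)


def suspicion_ratio(entropy_values, supported):
    """Instantiated entropy divided by total entropy (0.0 for an empty event)."""
    total = total_entropy(entropy_values)
    return instantiated_entropy(entropy_values, supported) / total if total else 0.0


# Hypothetical event with four phenomena, two of them currently supported.
entropy_values = {"g1": 7, "g2": 8, "g3": 1, "g4": 2}
ratio = suspicion_ratio(entropy_values, supported={"g1", "g3"})
print(ratio)
```

Here the instantiated entropy is 7 + 1 = 8 out of a total of 18, so the suspicion-ratio is 8/18.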


Figure 2. A single message supporting some constituent phenomena of a single event, with total and
instantiated entropy values illustrated, together with the instantaneous suspicion-ratio.

The suspicion accumulator $s_n(e_i)$ for an event $e_i$ is a temporal sequence of
suspicion-ratios, updated whenever a new message is processed.

The  volume-ratio  for  an  event  ej  is  the  quotient  of  the  number  of 
messages  which  support  ej  with  the  total  number  of  messages  contained  within  a  time 
frame  of  interest. 

The  volume  accumulator  vn(ej)  for  an  event  ej  is  a  temporal  sequence  of 
volume-ratios,  updated  whenever  a  new  message  is  processed. 


The suspicion-volume accumulator $sv_n(e_k)$ for an event $e_k$ is the
sequence defined by the point-by-point multiply of the suspicion accumulator with the
volume accumulator: $sv_n(e_k) = \{ s_i(e_k) \cdot v_i(e_k) \mid i = 1, \ldots, n \}$, where n is the number of
messages processed during the time frame of interest.
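
The accumulators can be sketched as plain lists, one entry per processed message. The function names are assumptions, and the two input sequences below are illustrative values only.

```python
def volume_ratio(n_supporting, n_total):
    """Messages supporting the event divided by all messages in the time frame."""
    return n_supporting / n_total if n_total else 0.0


def suspicion_volume(suspicion_seq, volume_seq):
    """Point-by-point product of the suspicion and volume accumulators."""
    if len(suspicion_seq) != len(volume_seq):
        raise ValueError("accumulators must cover the same messages")
    return [s * v for s, v in zip(suspicion_seq, volume_seq)]


# Illustrative accumulators after five messages.
s_seq = [0.33, 0.33, 0.50, 0.33, 0.17]
v_seq = [1.0, 1.0, 1.0, 0.75, 0.60]
sv_seq = suspicion_volume(s_seq, v_seq)
print([round(x, 2) for x in sv_seq])
```

Because both factors lie between 0 and 1, the product falls quickly once either the evidence for an event or its share of the message traffic declines.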

An  event  ej  warrants  suspicion-arousal  if  its  suspicion-ratio  exceeds  a 
specified  necessity  condition,  or  if  its  suspicion  accumulator  sequence  becomes 
monotonically  increasing. 

Example. The figure below depicts an event template which contains a
total entropy of 27 information units, within a framework of 10 constituent
phenomena. Suppose the criterion for suspicion-arousal is that the instantiated
entropy be greater than 6 information units. There are $2^{10} = 1024$ ways of logically
conjuncting the 10 constituent phenomena. An Interlisp search routine was
implemented to identify those which fail to trigger suspicion-arousal. Result: 78 cases
fail to satisfy the criterion.
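
The exhaustive search in this example can be sketched in a few lines. This is not the Interlisp routine, and the ten entropy values below are hypothetical: they sum to 27 but are not the values of the paper's template, so the count printed here should not be expected to equal 78.

```python
from itertools import product


def count_non_arousing(entropy_values, threshold):
    """Count truth assignments whose instantiated entropy is not above threshold.

    Returns (failing configurations, total configurations).
    """
    failing = 0
    for assignment in product((0, 1), repeat=len(entropy_values)):
        instantiated = sum(v for v, on in zip(entropy_values, assignment) if on)
        if instantiated <= threshold:  # criterion "greater than threshold" fails
            failing += 1
    return failing, 2 ** len(entropy_values)


# Hypothetical template: ten phenomena, total entropy 27.
template = [1, 1, 2, 2, 2, 3, 3, 4, 4, 5]
failing, total = count_non_arousing(template, threshold=6)
print(failing, "of", total, "configurations fail to trigger suspicion-arousal")
```

The count depends on how the 27 units are distributed over the phenomena: concentrating entropy in a few phenomena leaves more low-entropy combinations below the threshold.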


Figure  3.  One  of  78  Event  Template  Configurations  (out  of  1024)  which  Fails  to  Trigger 
Suspicion-arousal  Under  the  Specified  Constraint. 

A  temporal  cusp  is  defined  to  be  a  point  in  time  when  the 
suspicion-volume  accumulator  sequence  for  one  event  becomes  monotonically 
increasing  (decreasing),  while  concurrently  the  suspicion-volume  accumulator  for 
another  event  becomes  monotonically  decreasing  (increasing). 
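
One possible reading of the temporal cusp definition is sketched below. It is an interpretation, not the paper's algorithm: a cusp is reported at the first message where one event's suspicion-volume sequence steps upward while the other's steps downward, and it is not reported again while that same opposed pattern continues. The function names and the two sequences are illustrative.

```python
def step_sign(prev, curr, eps=1e-9):
    """+1 for an increase, -1 for a decrease, 0 for no change."""
    if curr > prev + eps:
        return 1
    if curr < prev - eps:
        return -1
    return 0


def temporal_cusps(sv_a, sv_b):
    """Indices (0-based) where the two sequences begin moving in opposite directions."""
    cusps = []
    previous = None
    for t in range(1, min(len(sv_a), len(sv_b))):
        pattern = (step_sign(sv_a[t - 1], sv_a[t]), step_sign(sv_b[t - 1], sv_b[t]))
        opposed = pattern in ((1, -1), (-1, 1))
        if opposed and pattern != previous:
            cusps.append(t)
        previous = pattern
    return cusps


# Illustrative suspicion-volume accumulators for two competing events.
sv_first = [0.33, 0.33, 0.50, 0.25, 0.10]
sv_second = [0.00, 0.00, 0.08, 0.25, 0.30]
cusp_indices = temporal_cusps(sv_first, sv_second)
print(cusp_indices)
```

With these sequences the only cusp falls at index 3, that is, at the fourth message, where the first event starts to decay as the second continues to emerge.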
An event $e_i$ warrants suspicion-confirmation if its suspicion-ratio exceeds
a specified sufficiency condition, or a temporal cusp favorable to $e_i$ is detected and all
other events have less instantiated entropy than $e_i$.


Exercise. Consider the message stream below together with support arcs
pointing to events (the dashed lines represent phenomenon revocation, which is
defined in the next section, but for the purpose of this example causes cancellation of
a support arc). Compute the suspicion-accumulator, volume-accumulator, and
suspicion-volume accumulator sequences for events E1 and E3. Also identify any
messages which cause a temporal cusp.


Solution. 


timt  *n(ED 
Ml  {.33} 


{.33,33} 

{.33, .33,50} 
{.33, .33, .50,  33} 


vn(Et) 

{10} 

{10,1.0} 

{1.0,1. 0,1.0} 

{1.0,1. 0.1.0, .75} 


{.33, .33, .50,  33, .17}  {1.0, 1.0,1. 0,. 75, .60} 

^  {.33,33,50,25,.  10} 


*n(E3) 

(0,0) 

{00,0.0} 

{0.0, 0.0, .25} 

{0.0, 0.0,  .7.5,50} 
{00,0.0,25,50,50} 


Vn(E3) 

{00} 

(0.0, 0,0} 

(0.0,0, 0,.33} 

{0.0, 0,0,. 33, .50} 
{0.0,0  0,.33,.50,. 60} 


E3:  {0.0, 0.0, .08, .25, .30} 


L*?V*V 


m. 


iip 

1 1  p  f 


m 


$ 

as 


iM 

ftftv 


ft 


Figure 4. An illustration of the suspicion accumulator and volume accumulator sequences for two
events. Also shown are the suspicion-volume accumulators for each. The dashed lines indicate
strong-sense constituent phenomenon revocation (defined below). Message M4 causes a temporal
cusp, together with suspicion-confirmation of E3 (assuming that the instantiated entropy for E1 is
diminutive when compared to that of E3).
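
The exercise can be checked mechanically. The sketch below is only an illustration: it assumes that the suspicion-volume accumulator is the element-wise product of the suspicion and volume accumulators (which is what the worked numbers suggest), and it reads "becomes monotonically increasing (decreasing)" as a change of direction in one sequence while the other sequence moves the opposite way. The function names are hypothetical.

```python
def sv_sequence(suspicion, volume):
    """Element-wise product of suspicion and volume accumulators (assumed definition)."""
    return [s * v for s, v in zip(suspicion, volume)]


def sign(x, tol=1e-9):
    """Return -1, 0, or +1 for the direction of a step."""
    return 0 if abs(x) < tol else (1 if x > 0 else -1)


def temporal_cusps(a, b):
    """Return 1-based message numbers at which one sequence changes direction
    while the other sequence moves in the opposite direction."""
    cusps = []
    for i in range(2, len(a)):
        for x, y in ((a, b), (b, a)):
            prev_dir = sign(x[i - 1] - x[i - 2])
            new_dir = sign(x[i] - x[i - 1])
            other_dir = sign(y[i] - y[i - 1])
            if new_dir != 0 and new_dir != prev_dir and other_dir == -new_dir:
                cusps.append(i + 1)
                break
    return cusps


# Final accumulator sequences from the solution (after message M5).
s1, v1 = [.33, .33, .50, .33, .17], [1.0, 1.0, 1.0, .75, .60]
s3, v3 = [0.0, 0.0, .25, .50, .50], [0.0, 0.0, .33, .50, .60]

sv1 = sv_sequence(s1, v1)
sv3 = sv_sequence(s3, v3)
print(temporal_cusps(sv1, sv3))
```

Under these assumptions the only cusp reported is at the fourth message, which agrees with the caption of Figure 4.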

Note  that  this  theory  of  event  recognition  relies  upon  monotonic 
conditions  induced  by  the  conservation  of  resources  shared  by  events  evolving  in  time, 
and  by  so  doing  abstains  from  decision  based  on  numerical  thresholds.  In  the  example 
above,  suspicion  about  the  existence  of  E3  was  confirmed  with  only  half  its  constituent 
phenomena  instantiated  by  message  evidence,  and  with  an  instantaneous 
suspicion-volume accumulator value of only .25.

A  potentially  powerful  technique  to  abduce  an  event  from  message  data 
is  the  occasional  use  of  the  subjunctive  voice  when  attempting  to  logically  instantiate 
the  constituent  phenomena  of  an  event.  It  may  be  the  case  that  several  constituent 
phenomena  become  true  if  the  truth  of  just  one  primitive  clause  is  (for  the  time  being) 
assumed,  even  though  the  message  data  has  not  yet  corroborated  the  primitive  clause. 
Refer  back  to  Figure  1  for  an  instance  of  the  explicit  use  of  the  subjunctive  voice. 


Discounting  Events  which  have  already  Emerged,  Sustained,  and  Decayed  in  Time. 

Much  attention  has  been  paid  in  the  literature  to  deciding  when  an  event 
is  supported  by  evidence.  Equally  important  is  determining  when  an  event  no  longer 
warrants  having  its  constituent  phenomena  maintained  because  of  the  decay  of 
information  over  time.  Currently  automated  systems  are  frequently  incompetent  when 
event  probabilities  reach  a  plateau.  When  such  is  the  case,  a  computer  process  should 
be  capable  of  deciding  whether  the  event  is  continuing  to  progress,  or  has  already 
sustained  and  decayed.  When  an  event  becomes  obsolete,  automated  techniques  are 
required  to  revoke  its  constituent  phenomena  so  that  the  computer's  belief  in  the 
event  is  retracted,  or  at  least  discounted.  The  following  section  describes  a  set  of 
computational  techniques  designed  to  solve  problems  in  this  area. 

A message mj is said to be revocation-provocative in the weak sense with
respect to event ek iff ∃ some message mi less timely than mj; xs ∈ mi, mj; gt ∈ ek;
gt(xs|mi) = 1, and gt(xs|mj) = 0. See Figure 5.


Figure 5. Weak-sense Phenomenon Revocation.


Discussion.  Weak-sense  phenomenon  revocation  may  provide  the  rudiments  for 
automated non-monotonic reasoning. Before one may accommodate the
unanticipated,  one  must  be  capable  of  suspending  belief  in  a  previous  state  of  the 
world  by  reasoning  in  the  following  way: 

a)  Some  object  has  obtained  new  spatial  and  temporal  coordinates  which  negate 
belief  in  an  earlier  set  of  coordinates  which  were  accountable  by  some  event; 


b)  No  other  explicitly  modeled  event  contains  constituent  phenomena  capable  of 
explaining  the  new  coordinates. 

Once  an  automaton  demonstrates  a  weak-sense  phenomenon  revocation  capability, 
its  next  logical  step  would  be  to  generate  a  new  event  to  explain  the  coordinates  of 
the  errant  object.  There  is  currently  no  technology  available  to  perform  this  process, 
and  it  is  not  likely  that  there  will  be  for  some  time,  since  a  leap  of  this  magnitude  is 
intrinsically linked to data-driven templating and learning by discovery.


A message mj is said to be revocation-provocative in the strong sense
with respect to event ek iff ∃ some event el different from event ek, ∃ some message
mi less timely than mj; xs ∈ mi, mj; gt ∈ ek; gu ∈ el; with gt(xs|mi) = 1, gt(xs|mj) = 0,
and gu(xs|mj) = 1.

Figure 6. Strong-sense Phenomenon Revocation.


An  event  ej  becomes  stale  if  both  its  suspicion  and  volume  accumulator 
sequences become strictly monotonically decreasing.



An  event  ej  warrants  having  its  attributes  revoked  (i.e.,  its  constituent 
attributes  gj  set  to  0)  under  two  conditions: 

i. A temporal cusp unfavorable to ej is detected.

ii. ej is determined to be stale.


Event Stasis Induced by External Phenomena.

Under  certain  conditions  the  constituent  phenomena  of  an  event  may 
become  inert  for  protracted  periods  of  time.  In  such  a  situation,  the  event  is  said  to  be 
undergoing stasis. Stasis is caused by the existence of external phenomena (not
associated  with  the  event)  which  tend  to  force  spatial  immobility  upon  the  objects 
which  (logically  conjoined  with  a  map)  define  the  constituent  phenomena  of  the 
event. 

The  stasis  factor  of  an  event  is  defined  to  be  the  tendency  for  the 
constituent  phenomena  of  an  event  to  remain  inert.  The  stasis  factor  is  computed  by  a 
two-step  process: 

1.  Construct  the  stasis  matrix  as  follows:  for  each  constituent 
phenomenon  (whether  instantiated  or  virtual)  belonging  to  the  event,  assign  a 
probabilistic  estimate  representing  the  certainty  that  there  exist  external  phenomena 
committed  to  any  of  the  following: 

a.  Prolonging  the  constituent  state. 

b.  Transitioning  the  constituent  object(s)  from  the  current  state  to 
one  recently  visited. 

2.  Average  across  all  probabilities  derived  at  step  1. 
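
The two-step computation can be sketched as follows. This is a minimal illustration that assumes the stasis matrix holds one probability per constituent phenomenon; the phenomenon names and values are hypothetical.

```python
from statistics import mean


def stasis_factor(stasis_matrix):
    """Average the per-phenomenon stasis probabilities (step 2).

    stasis_matrix maps each constituent phenomenon, instantiated or virtual,
    to the estimated probability that external phenomena are prolonging its
    state or returning its objects to a recently visited state (step 1).
    """
    probabilities = list(stasis_matrix.values())
    if not probabilities:
        raise ValueError("an event needs at least one constituent phenomenon")
    if any(p < 0.0 or p > 1.0 for p in probabilities):
        raise ValueError("stasis estimates must be probabilities")
    return mean(probabilities)


# Hypothetical event with three constituent phenomena.
example_matrix = {"g1": 0.2, "g2": 0.4, "g3": 0.9}
example_factor = stasis_factor(example_matrix)
print(round(example_factor, 2))
```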

Stasis  as  used  here  is  a  state  of  the  world  induced  by  countermeasures, 
and  is  functionally  akin  to  the  result  obtained  by  applying  a  minimax  criterion  utilized 
by  game  theorists. 


Threat  Computation  is  a  Nonsimplistic,  Data-driven  Process. 


The  threat  of  an  event  cannot  be  derived  by  isolating  the  event  from  its 
environment.  An  event  which  unto  itself  seems  threatening  may  in  fact  be  quite 
innocuous given that sufficient countermeasures are brought into play. Other factors
which  must  be  utilized  in  the  derivation  of  threat  include  both  the  nature  of  preceding 
events  and  the  potential  impact  of  follow-on  events.  This  section  describes  a 
computational  technique  to  derive  the  threat  of  an  event  based  on  both  the  support 
for  an  event  and  the  countermeasures  at  hand  to  thwart  the  event. 


Figure  7.  An  Event  Emerging  During  the  Decay  of  its  Predecessor  (dashed  area 
indicates  the  region  of  constituent  phenomena  revocation  for  the  first  event). 


Assume that an event E1 has already transpired, and that another event
E2 may be emerging. Since the same primitive resources will be utilized in event E2 as
were used in event E1, we expect to see the computed subjective probability of event
E2 rise at the same time that the computed probability of event E1 starts to fall (see
Figure 7). Symbolically, we represent this as P(E2|E1), read "the probability that E2 is
emerging given that E1 is decaying". Earlier research focused on developing a
data-driven technique which lends itself to modeling the unsystematic skewness of
events for which message data is providing asynchronous clues, and against which
countermeasures may be progressing [C2]. The distribution of choice is the Weibull
distribution, which has density function:


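\[
f(t) =
\begin{cases}
\alpha \beta \, t^{\beta - 1} e^{-\alpha t^{\beta}}, & \text{for } t > 0, \\
0, & \text{elsewhere,}
\end{cases}
\qquad \text{for } \alpha, \beta > 0. \tag{5.1}
\]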


If this function is differentiated with respect to time and set to zero, the
critical value of t is:

\[ t = \left( \frac{\beta - 1}{\alpha \beta} \right)^{1/\beta} \tag{5.2} \]

This expression is significant because it predicts at what value of time (in
terms of α and β) the distribution will peak. All that remains is to couch the probability
of an emerging event together with the countermeasures available to thwart the
event in terms of α and β.
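
For readers who want the intermediate step, the critical value can be checked as follows. This is a sketch that assumes the Weibull density is parameterized as f(t) = αβ t^(β-1) exp(-α t^β) and that β > 1, so that an interior peak exists.

```latex
\begin{aligned}
f'(t) &= \alpha\beta \, t^{\beta - 2} e^{-\alpha t^{\beta}}
         \left[ (\beta - 1) - \alpha\beta \, t^{\beta} \right] = 0 \\
\Rightarrow \quad t^{\beta} &= \frac{\beta - 1}{\alpha\beta}
\quad \Rightarrow \quad
t = \left( \frac{\beta - 1}{\alpha\beta} \right)^{1/\beta}.
\end{aligned}
```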

Computation  of  the  Probability  of  Emerging  Events. 

Let P(E2|E1) be the probability that event E2 is emerging given that event
E1 is decaying. This probability is equal to the quantity obtained by normalizing the
suspicion-volume accumulator for E2 with respect to those for all other events in the
event set. This quantity is also known as the evidence for event E2 with respect to the
reference class Ω, or simply as the evidence for event E2.
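
One plausible reading of this normalization is to divide each event's instantaneous suspicion-volume value by the sum over the event set. The sketch below assumes that reading; the function name is hypothetical, and the input values are the final suspicion-volume values from the exercise accompanying Figure 4.

```python
def evidence(sv_by_event):
    """Normalize instantaneous suspicion-volume values so that they sum to 1.

    Assumption: normalization means dividing by the total over all events
    in the event set.
    """
    total = sum(sv_by_event.values())
    if total == 0:
        # No event has any support yet, so no evidence can be assigned.
        return {name: 0.0 for name in sv_by_event}
    return {name: sv / total for name, sv in sv_by_event.items()}


# Suspicion-volume values after the fifth message in the earlier exercise.
latest = {"E1": 0.10, "E3": 0.30}
ev = evidence(latest)
print({name: round(value, 2) for name, value in ev.items()})
```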

Computation of the Probability of Countermeasures to an Emerging Event.

Let P(C|E2) be the probability that countermeasures are available to thwart E2, given
that E2 is emerging. This probability is computed by noting the real and virtual
constituent phenomena of E2, setting up the stasis matrix for E2, and computing the
stasis factor across all phenomena for E2.

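Make the following substitutions for α and β in equation [5.2]:

\[ \alpha = \frac{1}{1 - P(C \mid E_2)} \tag{5.3} \]

\[ \beta = \frac{1}{P(E_2 \mid E_1)} \tag{5.4} \]

The resultant critical value is:

\[ t = \left[ \bigl(1 - P(E_2 \mid E_1)\bigr)\bigl(1 - P(C \mid E_2)\bigr) \right]^{P(E_2 \mid E_1)} \tag{5.5} \]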


Define the threat Te of an event to be equal to this critical value. Note that threat is a
function  of  probability-valued  functions,  and  is  mapped  to  the  interval  [0,1]. 
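
The threat computation can be sketched numerically. The example below is an illustration only: it assumes the Weibull peak time has the form ((β - 1)/(αβ))^(1/β), applies the substitutions α = 1/(1 - P(C|E2)) and β = 1/P(E2|E1), and uses hypothetical argument names and input values.

```python
def threat(p_emerging, p_counter):
    """Threat of an event as the Weibull peak time under the substitutions above.

    p_emerging: P(E2|E1), the evidence for the emerging event, in (0, 1].
    p_counter:  P(C|E2), the stasis factor of the emerging event, in [0, 1).
    """
    if not 0.0 < p_emerging <= 1.0:
        raise ValueError("p_emerging must lie in (0, 1]")
    if not 0.0 <= p_counter < 1.0:
        raise ValueError("p_counter must lie in [0, 1)")
    alpha = 1.0 / (1.0 - p_counter)
    beta = 1.0 / p_emerging
    return ((beta - 1.0) / (alpha * beta)) ** (1.0 / beta)


def threat_closed_form(p_emerging, p_counter):
    """The same quantity written as in equation [5.5]."""
    return ((1.0 - p_emerging) * (1.0 - p_counter)) ** p_emerging


example_threat = threat(0.5, 0.5)
print(example_threat)
```

Writing the function both ways makes it easy to confirm that the substituted form and the closed form agree for any admissible pair of probabilities.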

Discussion of the computational implications of the threat expression: A close look at
equation [5.5] reveals that the derived threat is polynomially related to both the
support for other events [called the plausibility of the event under the
Dempster-Shafer formalism], and to the lack of countermeasures at hand to thwart the
event. However, threat is exponentially related to the direct support for the event.

The threat of a message is defined to be precisely equivalent to the maximal threat of
the list of events whose constituent phenomena are supported by the message. Let Ej
be the event in the event set with the maximum instantaneous suspicion ratio. The
message threat is directly proportional to both the evidence for Ej and the stasis factor
of Ej.


Filtering  Operations  on  Message  Streams,  and  the  Equivalence  of  Priority  with  Threat. 

As  messages  enter  a  processing  center  for  analysis,  the  sheer  volume  of 
traffic  can  rapidly  generate  a  backlog  which  begs  attention.  It  is  reasonable  to  seek 
automated  assistance  in  ordering  the  queue  based  on  the  priority  of  the  messages,  so 
that  the  most  time-critical  items  are  presented  first.  Queuing  based  solely  on  either 
message  time  of  arrival  or  message  timeliness  is  inappropriate  because  the  threat  of 
events for which the messages provide evidence must be brought into play.
Regrettably,  threat  is  a  context-sensitive  process,  and  must  be  painfully  derived  by 
abduction  of  events  from  the  message  data.  The  following  section  develops  two 
theorems  which  show  respectively:  a)  the  number  of  ways  to  order  a  stream  of  n 
messages;  b)  the  number  of  feasible  filtering  solutions  on  a  stream  of  n  messages. 

A  message  stream  is  a  queue  of  messages  ordered  chronologically  by  time 
of  arrival  in  the  queue. 

A  time-ordered  queue  is  a  message  queue  sorted  by  timeliness  of  the 
individual  messages. 


A  filtering  of  a  message  stream  m  is  a  permutation  based  on  ordering  m 
as  a  monotonically  decreasing  function  of  threat. 


A coarse threat quantization scheme on a message stream m of n
messages is a partition of m into k threat classes such that every message contained in
m is assigned to exactly one of the k classes.


Theorem 1. Number of Possible Ways to Order a Stream of n Messages.


There are n! ways to order a message stream of length n.

Proof. Since a filtering is a permutation on n objects, there are n! ways to
order a message stream.


Definition. A feasible filtering solution is a filtering in which every message is
correctly assigned to a threat class by a coarse threat quantization scheme.

Theorem 2. Number of Feasible Filtering Solutions on a Stream of n Messages.

Let P be a coarse threat quantization scheme on a message stream of
length n into threat classes {C1, C2, ..., Ck}. Let |Ci| denote the order of class Ci. Then
the number of feasible filtering solutions is equal to Π |Ci|!, i = 1,k; with Σ |Ci| = n.

Proof. Let Ci be an arbitrary threat class. Then the number of ways to
order |Ci| messages within the class is |Ci|!. Over all k threat classes, the number of
possible orderings is |C1|! * |C2|! * ... * |Ck|! = Π |Ci|!, i = 1,k.

Example. Below is a diagram showing 9 messages coarsely quantized into
5 levels of threat. Theorem 1 asserts 9! = 362,880 possible orderings on this message
stream. Theorem 2 says that this number can be reduced to 1! * 1! * 3! * 3! * 1! = 36
feasible filterings.
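
The two counts in this example are easy to reproduce. The sketch below uses hypothetical function names and takes the class sizes 1, 1, 3, 3, 1 from the example.

```python
from math import factorial, prod


def total_orderings(n):
    """Theorem 1: number of permutations of a stream of n messages."""
    return factorial(n)


def feasible_filterings(class_sizes):
    """Theorem 2: product of the factorials of the threat class sizes."""
    return prod(factorial(size) for size in class_sizes)


class_sizes = [1, 1, 3, 3, 1]          # 9 messages in 5 threat classes
all_orders = total_orderings(sum(class_sizes))
feasible = feasible_filterings(class_sizes)
print(all_orders, feasible)
```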


Message  Processing  Sources  of  Error  and  the  Potential  for  Recovery. 

Messages  may  contain  two  distinct  data  structures:  a  statistical  feature 
vector,  and  an  excerpt  of  language  uttered  by  a  human  being  who  in  some  way 
interacts  with  the  object  characterized  by  the  feature  vector.  Machine  processing  of 
messages  therefore  involves  comparing  and  contrasting  feature  vector  data,  together 
with  natural  language  processing.  These  two  types  of  reasoning  processes  are 
sufficiently diverse that mainstream technology thrusts in each area have been
pursued  in  parallel  for  several  decades,  with  one  thrust  being  in  the  statistical  pattern 
recognition  arena,  and  the  other  in  computational  linguistics.  Both  technologies 
continue  to  produce  new  research,  and  each  suffers  from  its  own  peculiar  form  of 
error.  It  is  instructive  to  play  the  devil's  advocate  and  construct  a  taxonomic  error  tree, 
which  graphically  portrays  the  ways  in  which  an  automated  message  processing  system 
may  be  fooled,  either  by  errors  in  the  message  data,  or  by  faulty  reasoning  about  the 
data: 


Figure  9.  Message  Processing  Error  Forms  and  Recovery  Techniques. 


Feature  vector  error  is  generally  attributable  to  measurement  error  of  the 
sensor  which  gave  rise  to  the  feature  values,  but  can  also  occur  during  statistical 
testing because of faulty modeling. Due to the limited sensitivity of the sensor
working within the constraints of terrain and other sources of interference, any
particular attribute value must be characterized by an error ellipse probable (EEP),
with the semi-major axis along the perceived line-of-bearing, and the semi-minor axis
sweeping across the arc described by the angular resolution of the direction-finding
capability of the sensor.

However,  there  is  also  an  error  associated  with  modeling  the  statistical 
distributions  of  the  geodynamic  objects  which  the  sensors  are  attempting  to  measure. 
It  may  be  that  an  object’s  location  is  inappropriately  characterized  by  a  normal 
distribution,  whereas  if  the  probable  direction  of  movement  is  known  a  priori,  it  may 
behoove  the  system  modeler  to  utilize  some  distribution  which  is  conveniently  skewed 
in  the  direction  of  the  motion.  Conversely,  if  it  is  known  that  an  object  is  currently 
stationary,  it  is  advisable  to  ensure  that  the  distribution  used  for  modeling  possesses  a 
bell  shape. 

Yet  another  source  of  error  when  performing  statistical  tests  with  feature 
vector  data  is  the  problem  associated  with  hardwiring  a  statistical  level  of  significance 
to  a  particular  test.  A  Boolean  decision  is  made  about  the  null  hypothesis  based  on  the 
outcome  of  this  test.  For  example,  it  may  be  the  case  that  a  test  of  means  fails  at  the 
.95  confidence  level,  and  therefore  the  null  hypothesis  is  rejected  out  of  hand.  A  less 
biased  approach  would  be  not  to  make  a  deterministic  decision  about  the  truth  of  the 
null  hypothesis,  but  rather  post  an  indication  of  how  well  the  test  was  passed,  or  what 
level  of  significance  would  guarantee  that  the  test  is  passed. 

The  worst-case  branching  factor  of  introducing  two  levels  of  relaxation 
into  statistical  testing  is  m  X  n,  where  m  is  the  number  of  distributions  used  to  model 
phenomena,  and  n  is  the  number  of  levels  of  significance  over  which  the  tests  are 
conducted.  Knowledge-based  statistical  testing  permits  an  intelligent  ordering  of  the 
tests,  so  that  the  most  likely  distribution  (based  on  data-driven  knowledge  about  the 
phenomenon)  is  selected  to  be  checked  first.  For  example,  a  check  for  a  moving 
object's  location  may  pass  a  chi-square  test  of  means  at  the  .95  confidence  level,  yet 
not  pass  a  Gaussian  test  until  the  level  of  significance  is  dropped  to  a  .50  level.  The 
more powerful the search knowledge, the less costly the relaxation process. When the
data is well-modeled and sensor measurement error is at a minimum, a cost of 1 is
enjoyed, since the appropriate distribution is selected immediately, and the highest
confidence level test of means (for the given distribution) is passed.

Because  any  statistical  test  of  a  null  hypothesis  will  be  passed  (no  matter 
what  the  distribution)  if  the  confidence  level  is  sufficiently  low,  it  is  not  prudent  from  a 
decision-theoretic  standpoint  to  use  a  depth-first  search  during  the  two  levels  of 
relaxation. Instead, it makes sense to start with a high confidence level, breadth-wise
test  across  an  intelligently  ordered  menu  of  distributions  for  acceptance  of  the  null 
hypothesis,  and  then  decrement  to  a  lower  confidence  level  if  all  tests  are  failed. 
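
The breadth-first relaxation described above can be sketched as a pair of nested loops. This is only an illustration: the function name, the `passes` predicate, and the stub thresholds are hypothetical stand-ins for real statistical tests.

```python
def relaxed_search(distributions, confidence_levels, passes):
    """Try every distribution at the highest confidence level first, then relax.

    distributions:     candidate distributions, most likely first.
    confidence_levels: levels over which the tests may be conducted.
    passes:            callable (distribution, level) -> bool, the test itself.
    Returns (distribution, level, cost), where cost counts the tests run.
    """
    cost = 0
    for level in sorted(confidence_levels, reverse=True):
        for dist in distributions:
            cost += 1
            if passes(dist, level):
                return dist, level, cost
    return None, None, cost


# Stub mirroring the moving-object example: the chi-square test passes at the
# .95 level, while the Gaussian test passes only once relaxed to .50.
thresholds = {"chi-square": 0.95, "gaussian": 0.50}


def passes_stub(dist, level):
    return level <= thresholds[dist]


levels = [0.95, 0.75, 0.50]
well_ordered = relaxed_search(["chi-square", "gaussian"], levels, passes_stub)
poorly_ordered = relaxed_search(["gaussian", "chi-square"], levels, passes_stub)
print(well_ordered, poorly_ordered)
```

With a well-ordered menu the cost is 1; with m distributions and n levels the loops run at most m × n tests, which is the worst-case branching factor quoted earlier.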


Figure 10. Relaxing Both Distribution Constraints Together With Levels of Significance to Enhance
Statistical Testing.


Natural  language  processing,  independent  of  the  particular  grammar 
used,  also  is  subject  to  different  forms  of  error.  The  problems  of  ambiguous  words, 
anaphora,  ellipsis,  and  prepositional  phrase  attachment  are  four  areas  which  continue 
to  produce  thesis-quality  research.  Parsers  work  with  sets,  whether  they  be  sets  of 
parts  of  speech,  sets  of  case  frames  for  verbs,  or  sets  of  semantic  primitives.  In  some 
form  or  another,  whether  syntactic  or  semantic,  all  possible  words  and  actions  are 
partitioned into cells (lexicons), each of which represents some generalized concept
about a grammar, or more generally about the world. Error in the most fundamental
sense can occur in two ways, just as in statistical testing: a string may fail to be inserted
into  its  proper  cell;  or  it  may  be  inserted  into  an  improper  cell. 

If a string of natural language fails the set membership test for any
lexicon during processing, and the string is in fact appropriate to the target domain,
then one of two alternative hypotheses may be true: either the system designer failed
to install the string into the appropriate lexicon during the knowledge engineering
phase, or the string may be misspelled. In the former case, an intelligent natural
language parser may be able to use context to deduce the grammatical class of the
string (e.g., it is frequently possible to guess that a test string is a location). If on the
other hand the string is misspelled, it may be computationally feasible to recover if the
error is not too serious. The following table enumerates the number of ways that
generic types of string error may occur during transmission, followed by the number
of strings which a machine must brute-force generate to guarantee recovery.


Some Preliminary Results Regarding n-character(1) String Error Recovery


Type Error                Possibilities(2)                          Recovery Combinatorics(3)

transposition             n - 1                                     n - 1
k-extra-letters           (n+k)Ck * 26^k                            (n+k)Ck
k-dropped-letters         nCk                                       nCk * 26^k
k-wrong-letters           25^k                                      nCk * 25^k
k1-drops-and-k2-adds(4)   nCk1 * (n-k1+k2)Ck2 * 26^k2               (n-k1+k2)Ck2 * (n+k2)Ck1 * 26^k1
k2-adds-and-k1-drops(4)   (n+k2)Ck2 * 26^k2 * (n+k2)Ck1             (n+k2)Ck1 * 26^k1 * (n+k2)Ck2


(1) Assuming for didactic reasons that a character is a member of the English alphabet.
(2) The number of ways that the error can happen in the world.
(3) The number of strings which a machine must generate to guarantee recovery.
(4) The processes of dropping and adding are obviously not commutative.
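
To make the table concrete, the sketch below generates the n - 1 candidate strings needed to recover from a single transposition and evaluates two of the recovery counts. The function names and the sample lexicon are hypothetical; the counts follow the nCk * 26^k and nCk * 25^k entries for dropped and wrong letters.

```python
from math import comb


def transposition_candidates(s):
    """All n - 1 strings obtained by swapping one pair of adjacent characters."""
    return [s[:i] + s[i + 1] + s[i] + s[i + 2:] for i in range(len(s) - 1)]


def recover_transposition(s, lexicon):
    """Brute-force recovery: keep the candidates that are lexicon members."""
    return [c for c in transposition_candidates(s) if c in lexicon]


def recovery_count_dropped(n, k, alphabet=26):
    """Strings to generate for k dropped letters: nCk * 26^k."""
    return comb(n, k) * alphabet ** k


def recovery_count_wrong(n, k, alphabet=26):
    """Strings to generate for k wrong letters: nCk * 25^k."""
    return comb(n, k) * (alphabet - 1) ** k


lexicon = {"brigade", "battalion"}      # hypothetical target-domain lexicon
recovered = recover_transposition("birgade", lexicon)
print(recovered, recovery_count_dropped(5, 1), recovery_count_wrong(5, 2))
```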


Implications  of  the  String  Error  Combinatorial  Expressions. 


All  string  error  can  be  explained  in  terms  of  linear  combinations  of  added 
or  dropped  characters.  From  the  above  table,  it  can  be  seen  that  guaranteed  recovery 
from  errors  of  the  type  indicated  in  the  last  four  rows  requires  an  algorithm  of 
exponential complexity, since an exponent appears in the recovery combinatorics
column. It has been shown elsewhere [G1] that, for fixed source and destination strings
and a finite number of operations, the destination string can be derived from the
source string in polynomial time, given that characters are corrected one at a time
rather than in groups of k.


Conclusions.


Due  to  context-sensitivity,  the  topic  of  message  filtering  cannot  be 
broached  without  addressing  the  more  fundamental  problem  of  recognizing  events 
pointed  to  by  message  evidence.  To  this  end,  a  formal  theory  of  event  recognition  is 
being  developed,  complete  with  a  treatment  of  the  computational  aspects  of 
implementation.  Formal  definitions  have  been  developed  for  such  concepts  as 
constituent  phenomena,  suspicion-arousal,  suspicion-confirmation,  weak  and 
strong-sense event revocation, event stasis, and the threat of an event. Combinatorial
expressions  have  been  derived  for  the  number  of  feasible  ways  to  filter  a  stream  of  n 
messages;  the  branching  factor  introduced  by  permitting  both  distribution-level  and 
confidence-level  relaxation  during  statistical  tests  of  means;  and  the  number  of 
machine-generated  strings  necessary  to  guarantee  recovery  from  generic  forms  of 
string  error  encountered  during  natural  language  processing. 

Future  Directions  of  the  Research. 

Work  shall  continue  on  developing  a  coherent  theory  to  explain 
message-driven  event  recognition,  with  the  ultimate  goal  being  to  filter  a  stream  of 
messages  which  are  providing  clues  to  the  events.  Although  the  work  thus  far  has 
striven  to  explain  how  a  human  decision  maker  suspects  and  confirms  hypotheses 
while handicapped with sparse data, the theory remains flawed because it is
incomplete.  New  work  shall  focus  on  an  epistemology  of  reasoning  with  the 
constituent  phenomena  which  comprise  an  event.  Currently  driving  the  research  is  the 
realization  that  a  human  problem  solver  frequently  tests  the  truth  of  an  unsupported 
clause  belonging  to  a  constituent  phenomenon  by  posing  it  in  the  subjunctive  voice, 
because  by  so  doing  the  truth  of  a  significant  portion  of  the  other  constituent 
phenomena  may  be  induced,  especially  when  they  were  for  all  intents  and  purposes 
already  true  but  for  the  lone  dissension. 

Implementation  Issues. 

The  objects  characterized  by  feature  vector  data  in  many  applications 
may  be  represented  by  a  taxonomic  hierarchy  of  semantic  activities.  To  limit  search,  a 
message router has been developed in Interlisp to extract the list of possible activities
alluded to by a message. The generic Conceptual Structures Representation Language
(CSRL) developed by Ohio State University [B1] is being utilized as a rapid prototyping
tool to further process the message by invoking the set of specific functional parsers
pointed to by the router list. The natural language system must of necessity be Type C,
which means that the beliefs and intentions of the communicators are taken into
account [H2]. As such, recent research on planning [G3, K1, L1] is being investigated to
bolster the Type C NLP knowledge base, and to enhance the control of the parsers.
Since an event is defined in terms of constituent phenomena, which are themselves
defined in terms of a map, spatial representation of the objects is crucial. There has
been some commendable work undertaken to represent the relative positions of
objects described with natural language [H1], but much remains to be done, especially
in bringing such a spatial configuration together with the absolute description
conveyed by a map. A companion document is in preparation to describe the
implementation which is currently underway.


ABSTRACT

The p-value in a hypothesis test, which is a well established and useful
although not universally used statistic, may be supplemented with q-values.
Each q-value, just like each possibly designed value of the Type-II risk
universally denoted by β, corresponds to a possible value of the tested
parameter. The algorithm for calculating q-values is the same as for
calculating β's; the inputs that yield β's include the Type-I risk, which is
universally denoted by α, and a planned number of measurements (i.e., planned
sample size). The corresponding inputs that yield q-values are the p-value
and the actual number of measurements (i.e., available sample size). Thus,
the q-values contain post-test Type-II risk information in the same manner
that the p-value contains post-test information about the Type-I risk.

By using a q-value which corresponds to a particular unacceptable value
of the tested parameter, different criteria can be established for the
rejection of the null hypothesis. Three alternate criteria imply rejection if
(1) (q-value/β), (2) (q-value/β)/(p-value/α), or (3) (q-value/p-value) is
greater than unity. The use of any of these three would be a radical
departure from the traditional rejection when 1/(p-value/α) is greater than
unity. The (q-value/p-value) criterion is independent of α, β, and the
planned sample size because both the p-value and q-value depend only on the
results of experimental measurements. All three of these alternate criteria
lead to trends in critical region size which differ from the trend resulting
from the traditional criteria. Replacement of the traditional rejection
criterion with one of the proposed alternate criteria, or with a decision procedure
incorporating rationale from the alternate criteria, could significantly
influence government and contractor relations and the products or services
involved.

Comments by panelists Drs. Kaye Basford and K. T. Federer are at the
end of this article.

1. Introduction. Hypothesis testing is a widely used procedure for designing
and conducting experiments to evaluate a parameter against a standard. In
government-contractor relations, the government sets the standard. The
acceptability of the contractor's product or service is often determined by a
hypothesis test.

a. The basic procedure is to:

(1) Formulate a null hypothesis, H0, relating a parameter, θ, to a
standard, θ0, and

(2) Reject H0 only if there is sufficient experimental evidence
that the assumption is unlikely.

The null hypothesis in government-contractor relations is usually the
assumption that the product or service meets the specification. The
traditional basis for rejection of H0, stated in terms of a statistic which
is increasingly being reported and interpreted, is that the p-value is too
small.

b. The p-value is defined as the probability of an additional experimental
result as unlikely as the data. It is a function of two properties of
the data:

(1) The used sample size, nu, which is the actual number of
measurements and

(2) Either the measurements or their ranks.

It is also a function of a third factor:

(3) The distribution of all possible measurements under the
assumption that H0 is true.

The traditional rejection criterion, written in a slightly obscure manner
which is in the same format as alternate rejection criteria proposed below, is

\[ \frac{1}{p/\alpha} > 1, \]

where p is the p-value and α is a predetermined probability of the Type-I
risk or the contractor's risk. This is the risk that the contractor's product
or service meets the standard but will be rejected by the hypothesis test.
Rejection when the p-value is too small is justified by an insistence that the
contractor will take a reasonable risk.

c. The p-value provides one aspect of post-test information. Statistics
called q-values describe the other viewpoint. For the introduction of
q-values, see "Proposed Additional Inferential Information During and After
Hypothesis Testing", Proceedings of the Thirtieth Conference on the Design of
Experiments in Army Research, Development, and Testing, Paul H. Thrasher,
1984.

d. Before data is taken, the Type-I risk is supplemented by the Type-II
or government's risk. This risk, denoted by β, is the probability of
incorrectly failing to reject H0. It is the companion risk to α, since α
is the probability of incorrectly rejecting H0. Since there are many values
of θ for which H0 is false and the alternate hypothesis denoted by Ha is
true, there are many β's. Each is a function of:

(1) α and

(2) The planned sample size, np, which is the planned number of
measurements.

The β's differ from one another because each is also a function of

(3) A specific value of θ which is not equal to or better than θ0.

If one of these unacceptable parameters, denoted by θu, is of particular
interest, then it is meaningful to concentrate on one β.

e. After the data is taken, the p-value is supplemented by q-values. The
same algorithm used to calculate β from α, np, θu, H0, and Ha may be used
to find a q-value. A q-value calculation differs from a β calculation in
that

(1) The p-value, instead of the original value of α, and

(2) nu, whether or not this is equal to np,

are used in the algorithm. Use of

(3) The same value of the parameter θu and the same hypotheses that
were used in the calculation of β

permits direct comparison between β and a q-value. A q-value tends to be
greater than a planned value of β if either nu < np or the p-value < α.
Similarly, making nu > np or obtaining data whose p-value > α tends to yield
a q-value smaller than the original value of β.

2. Alternate Rejection Criteria. The traditional rejection criterion is well
established. It is not, however, the only rational decision technique.

a. Instead of requiring that the contractor's risk not be too low, one
alternate is to require that the government's risk not be too high. This
argument replaces the traditional rejection criterion,

     1 / (p/α) > 1,

with the first alternate rejection criterion:

     q/β > 1.

The use of this alternate rejection criterion naturally requires that an
unacceptable parameter, θu, be set along with the standard, θ0. This
first alternate criterion shifts the emphasis completely away from the Type-I
risk to the Type-II risk.

b. A second alternate rejection criterion, which considers both the Type-I
and Type-II risks, is to reject if

     (q/β) / (p/α) > 1.

This result may be obtained by multiplying the left-hand sides of the
traditional and the first proposed alternate criteria.

c. A third alternate rejection criterion, which concentrates entirely on
the post-test information by considering only the p-value and a q-value while
ignoring the specific values α and β, is

     q/p > R.

When the limiting ratio of the post-test Type-II risk to the Type-I risk, R,
is set equal to one, rejection occurs under this criterion if the government's
risk exceeds the contractor's risk. Other values of R may be used to design
a test with other relative emphasis on the government's and contractor's
risks. This criterion considers the ratio, R = β/α, instead of considering α
and β separately.

d. The traditional rejection criterion and the three alternate rejection
criteria introduced above have contradictory and incomplete attributes.

No single criterion will provide a panacea for all situations. For example,
the third alternate criterion may be appropriate when large values of α and β
fortuitously cause no large financial or logistic difficulties (e.g., when the
contractor can easily rework rejected items and the government can feasibly
replace items not functioning properly). If either α or β must be small,
however, the third alternate may be inappropriate (i.e., setting R may not
provide the desired values of α or β). In this case, the second alternate may
be desired, or perhaps a simultaneous application of the traditional and first
alternate criteria may be warranted. Satisfying the second alternate
criterion does not guarantee that the traditional and first alternate
criteria are simultaneously satisfied. All of the criteria must be scrutinized
individually. Each, or each combination, must be justified or discarded
on the basis of its own characteristics. Only one, or one combination, can be
used in any particular hypothesis test.

3. Critical Regions in One Example. The critical regions, defined as
intervals in which data implies rejection of H0, may be found for any
situation in which traditional hypothesis testing is done. The specific
situation used in this section is one used in the previously referenced
presentation at the Thirtieth Conference on the Design of Experiments.
Basically, this situation has a standard, σ0², and an unacceptable level, σu²,
for the variance, σ², of a random variable which is assumed normal. The
Chi-squared distribution then describes (n-1)s²/σ² where s² is the sample
variance.
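The quantities in this example can be sketched numerically. The following is a minimal illustration, not the author's program: it assumes the null hypothesis is that σ² is no larger than σ0² (so large s² leads to rejection), and the function names (chi2_cdf, chi2_ppf, p_value, type2_risk, q_value, rejections) are hypothetical. The chi-squared distribution function is computed with a standard series so that only the Python standard library is needed.

```python
import math


def chi2_cdf(x, k):
    """P(X <= x) for a chi-squared variable with k degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma
    function, which is adequate for the moderate arguments used here.
    """
    if x <= 0.0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(a * math.log(z) - z - math.lgamma(a)))


def chi2_ppf(prob, k):
    """Inverse of chi2_cdf, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, k) < prob:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def p_value(s2, n_used, var0):
    """Attained Type-I risk: P(sample variance >= s2) when sigma^2 = var0."""
    return 1.0 - chi2_cdf((n_used - 1) * s2 / var0, n_used - 1)


def type2_risk(alpha, n, var0, var_u):
    """Type-II risk (beta) at sigma^2 = var_u for a level-alpha test on n items."""
    critical_s2 = var0 * chi2_ppf(1.0 - alpha, n - 1) / (n - 1)
    return chi2_cdf((n - 1) * critical_s2 / var_u, n - 1)


def q_value(s2, n_used, var_u):
    """Attained Type-II risk.

    Equivalent to type2_risk with the p-value in place of alpha and
    n_used in place of the planned n, because the critical sample
    variance for alpha = p is the observed s2 itself.
    """
    return chi2_cdf((n_used - 1) * s2 / var_u, n_used - 1)


def rejections(s2, n_used, n_planned, alpha, var0, var_u, R=1.0):
    """Evaluate the four criteria; each is rearranged to avoid division."""
    p = p_value(s2, n_used, var0)
    q = q_value(s2, n_used, var_u)
    beta = type2_risk(alpha, n_planned, var0, var_u)
    return {
        "traditional": p < alpha,                # 1 / (p/alpha) > 1
        "first_alternate": q > beta,             # q/beta > 1
        "second_alternate": q * alpha > p * beta,  # (q/beta) / (p/alpha) > 1
        "third_alternate": q > R * p,            # q/p > R
    }
```

Calling rejections(3.0, 10, 10, 0.05, 1.0, 4.0), for instance, evaluates all four criteria for a sample variance of 3 from ten measurements when σ0² = 1 and σu² = 4. The numbers in that call are illustrative only and are not taken from the paper.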

a. This example yields the critical regions plotted in figures 1
through 13. For this example at least, the trends in the critical regions of
the proposed alternate criteria are significantly different from those of the
traditional criterion. Two very evident trends are seen by looking at the
lower ends of the critical regions, which are called the critical points. For
the traditional criterion with reasonably low values of α, the critical
points

(1) All correspond to measurements worse than the standard and

(2) Decrease as the sample size increases.

b. For some situations involving the alternate criteria, the critical
points

(1) Correspond to measurements better than the standard and

(2) Increase as the used sample size increases.

c. Both of these properties are naturally disturbing to hypothesis
testers who are used to the normal criterion with low values of α. However,
both actually occur in the traditional criterion when the value of α is made
large enough.

d. The figures describing q/β > 1 and (q/β) / (p/α) > 1 are
much more complicated than those describing 1 / (p/α) > 1. However, the
figure describing q/p > 1 is as simple as the figures describing
1 / (p/α) > 1. This occurs because both p and q are independent of α,
np, and β.

4. Generalizations, Extensions, and Applications.

a. In figures 1 through 13, there is an inversion of trends between the
traditional criterion and any choice of alternate criterion. This appears to
be a general property for this particular hypothesis test. Much theoretical
and simulation work needs to be done, however, before extending this statement
to other hypothesis tests.

b. If any of the alternate criteria are applied in government-contractor
relations, significant changes will occur in the way business is done. It
would be entirely possible, for example, that contractors would be in the
government's present position of wanting an increased sample size. Using the
traditional criterion, the government is vulnerable when nu < np; using an
alternate criterion, the contractor may feel vulnerable when this change in the
planned number of tested items occurs. This inversion is certainly
significant. It could lead to a significant decrease in cost and/or increase in
quality of products or services that the government procures from contractors.

c. A secondary benefit from using any of the alternate criteria is that
the government would be forced to specify an unacceptable parameter as well as
a standard. This requirement would yield an improvement in management.

d. The choice of a criterion, or perhaps a set of simultaneous criteria,
for any situation must consider the costs of production, testing, and use of
the product or service. This consideration will undoubtedly be complicated
and many faceted. The measurement of cost may not even be straightforward
(e.g., dollars, time, lives, and military success may be competing measures of
cost). Nevertheless, the total cost should be minimized by a selection from
the possible rejection criteria.

Figure 1. "Normal" relationship between number of measurements, n, and
minimum sample variance, s²min, which implies rejection of
the null hypothesis that the variance is at least as small
as a standard, σ0². This "normal" decrease in s²min with
an increase in n results from using a rejection criterion
based on the ratio of attained to planned Type-I risks,
p-value/α, for commonly used values of α.


[Figure panel: solid lines mark regions of q/β > 1 when α = 0.1.]


Figure 2. "Inverted" relationship between s²min and used number of
measurements, nu, resulting from alternate criterion based
on ratio of attained to planned Type-II risks, q-value/β.
This "inverted" increase in s²min with an increase in nu
occurs for α = 0.01 and several combinations of planned
number of measurements, np, and discrimination ratios of
unacceptable to standard variances, σu²/σ0².

Figure 6. "Independence" of s²min and n.
This "almost constancy" occurs when the criterion is based on
the ratio of the attained Type-II to attained Type-I risks,
q-value/p-value, which is independent of α, β, and np.


[Figure panel: solid lines mark regions of q/p > 1/2.]



Figure 7. "Diminished independence" of s²min and n for low
σu²/σ0² and large n. This "not quite constancy" occurs
when the q-value/p-value criterion uses 1/2 instead of 1 as
the standard. Rejection is possible from point estimates of
s² less than σ0² when both σu²/σ0² and n are low.


[Figure panel: solid lines mark regions of q/p > 2, plotted for n = 3, 5, ..., 19.]



Figure 8. [...] σu²/σ0² and large n. This "almost constancy" is closer to
actual independence when the q-value/p-value criterion uses 2
instead of 1 as the standard.

Figure 9. "Diminished independence" of s²min and n for low σu²/σ0²
and large n. "Slight normalcy" as opposed to "very slight
inversion" results when the q-value/p-value criterion uses
5 instead of 2 as the standard.

Figure 12. Inverted s²min - nu relationship with (q-value/β)/
(p-value/α) criterion and α = 0.4. This "inversion"
strongly predominates when a high α is used in this
criterion. Again, high α and low β (or high np) allow a
possibility of rejection if the point estimate of s² is less
than σ0².

Figure 13. Combined illustration of s²min - n relationships when
q-value/p-value alternate criterion is used with different
standards, R. This compilation of figures 6 through 10
shows more "independence" between s²min and n (and also α,
β, and np) than the traditional p-value/α criterion or the
alternate q-value/β or (q-value/β)/(p-value/α) criteria.
Also, the possibility of rejection when s² < σ0² occurs
only when R, n, and σu²/σ0² are small.


COMMENTS BY PANELISTS DR. KAYE BASFORD AND PROFESSOR W. T. FEDERER

ON THE FOLLOWING ARTICLE


Use of the P-value and a Q-value in rejection criteria by
Paul H. Thrasher

U.S. Army White Sands Missile Range.

This paper was not presented but I received a copy of the
paper subsequently.

Kaye Basford: This is a very interesting paper in which it is
suggested that the usual Type I error (α) used to
accept or reject the null hypothesis be supported by
some information on the Type II error (β). Hence
instead of a decision being made solely on the
p-value, the q-value would also be used.

Dr. Thrasher suggested three alternative criteria for
rejection of the null hypothesis and studied their
behavior for one particular test. The general
properties of these decision criteria need to be
investigated for hypothesis tests with different
underlying distributions. Only then could a
recommendation be made on the desirability and
feasibility of introducing such a criterion. This is
a challenging research project which I hope will be
taken up in the near future.


INCORPORATING  FUZZY  SET  THEORY  INTO 


STATISTICAL  HYPOTHESIS  TESTING 


William  E.  Baker 
Probability  and  Statistics  Branch 
Ballistic  Research  Laboratory 
Aberdeen  Proving  Ground,  Maryland  21005 


ABSTRACT 


In many instances the data used in statistical hypothesis testing may be vague or
imprecise and, as such, may suggest results that are incorrect. Rank tests, in particular,
seem susceptible, since the original data, once ranked, have no further influence on the
testing procedure no matter how closely they are grouped. A possible solution is to
treat the ranks as fuzzy integers represented by membership functions that indicate the
degree to which each rank assumes each integer value. In this paper, a method is
suggested for obtaining these membership functions; and the concept is incorporated
into an existing rank test. An application of this fuzzy hypothesis-testing procedure is
provided.

I. INTRODUCTION


Suppose we have the following set of data:

{-0.888, 0.200, -1.000, -0.417, -0.052, 0.186, 0.067, -0.467, -0.623, -0.181} .  (1)

By considering their absolute values, we obtain a set S consisting of ordered pairs,

S = {(1, -0.052), (2, 0.067), (3, -0.181), (4, 0.186), (5, 0.200), (6, -0.417),  (2)
     (7, -0.467), (8, -0.623), (9, -0.888), (10, -1.000)} ,

where the first member of each ordered pair is the ranking (smallest to largest) of the
absolute value of the second member of the ordered pair. This type of data is often
used in rank tests, nonparametric hypothesis tests which generally examine the mean or
median of a distribution or the equality of means or medians of several distributions.
Rank tests are sometimes eschewed because once the ranking has been established, the
data are treated as though they were equally spaced; and potentially valuable
information concerning the proximity of the data points is discarded. In the preceding
example, note that some of the rankings may be tenuous; for example, ranks 3 and 4
could easily have been permuted had the numbers to which they correspond been
inaccurate in the third decimal place. Therefore, the degree of accuracy in the ranks is
directly related to the degree of accuracy of the original data; and this can sometimes be
a problem.

In many applications, the available data may be vague or imprecise, due to a
variety of reasons which may include improper calibration of equipment and subjectivity
of the experimenter. This, of course, can lead to imprecise ranking of the data and
possibly an incorrect conclusion from the resulting hypothesis test. Such data, as well
as their ranks, can be represented by fuzzy numbers¹ - a relatively new concept in
which a number is described by a central value along with a spread about that value.
When applied to ranks, this technique may overcome the previously mentioned problem
inherent in rank tests; and in certain situations this representation will allow for a more
realistic approach to hypothesis testing.


II. FUZZY RANKS APPLIED TO THE WILCOXON SIGNED-RANKS TEST

A. Wilcoxon Signed-Ranks Test

The Wilcoxon signed-ranks test is a nonparametric hypothesis test which is
generally used to test for equal medians of two distributions. The data consist of paired
observations (x_i, y_i) from the two distributions. The differences between the
observations, D_i = x_i - y_i, are then calculated; and their absolute values are assigned a
rank R_i from smallest to largest. Finally, R_i is multiplied by -1 if D_i is negative. The
sum of the ranks of the positive differences, T = Σ R_i, R_i > 0, is the test statistic. If
the two distributions have the same median, we would expect about one-half of the D_i's
to be positive. Very high or very low values of T indicate that numbers from the first
distribution are consistently higher or consistently lower than those from the second
distribution and, therefore, will cause rejection of the null hypothesis of equal medians.
The theory behind the test along with tables containing various quantiles of T are
provided by Conover².

For each ordered pair of the set S, we can consider the second value to be D_i and
the first value to be the R_i associated with it. Taking the sum of the R_i's associated
with the positive D_i's, we find that T = 2+4+5 = 11. Probability levels for the
Wilcoxon signed-ranks test for a sample of size 10 are given in Table 1. Referring to
this table, we find that our value of T indicates that there is insufficient evidence for
rejecting the hypothesis of equal medians at a 10% level of significance. In this case the
probability of T being less than or equal to 11 is 0.0527; and since we are performing a
two-sided test (examining T to see if its value is either too low or too high), we double
that figure to get the critical level of the test. Had the value of T been 10 or less,
rejection of the null hypothesis would have been warranted.
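The arithmetic above can be checked with a short sketch. It is an illustration only: the function names are hypothetical, the data are those of set (1) with the fourth positive value read as 0.186 as in set S, and the null distribution is obtained by counting subsets of the ranks, which assumes no ties and no zero differences.

```python
data = [-0.888, 0.200, -1.000, -0.417, -0.052,
        0.186, 0.067, -0.467, -0.623, -0.181]


def signed_rank_statistic(differences):
    """Sum of the ranks (by absolute value) of the positive differences."""
    ordered = sorted(differences, key=abs)
    return sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)


def null_cdf(n):
    """P(T <= t) for t = 0 .. n(n+1)/2 when each rank is positive with
    probability 1/2, found by counting the subsets of {1, ..., n}."""
    max_t = n * (n + 1) // 2
    counts = [1] + [0] * max_t  # counts[t] = number of subsets summing to t
    for rank in range(1, n + 1):
        for t in range(max_t, rank - 1, -1):
            counts[t] += counts[t - rank]
    total = 2 ** n
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / total)
    return cdf


T = signed_rank_statistic(data)
cdf = null_cdf(len(data))
critical_level = 2 * cdf[T]  # two-sided test
print(T, round(cdf[T], 4))   # 11 0.0527
```

Doubling the exact probability 54/1024 gives a critical level slightly above 0.10, which is why rejection at the 10% level is not warranted for T = 11 but would be for T = 10.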

TABLE 1. Probability Levels for the Wilcoxon Signed-Ranks Test Statistic
with a Sample Size of 10.*

   T     P         T     P         T     P         T     P

   0   .0010       7   .0186      14   .0967      21   .2783
   1   .0020       8   .0244      15   .1162      22   .3125
   2   .0029       9   .0322      16   .1377      23   .3477
   3   .0049      10   .0420      17   .1611      24   .3848
   4   .0068      11   .0527      18   .1875      25   .4229
   5   .0098      12   .0654      19   .2158      26   .4609
   6   .0137      13   .0801      20   .2461      27   .5000

T = sum of positive ranks

P = probability that the sum of positive ranks will be less than or equal to T
under the null hypothesis

* Since the distribution of T is symmetrical, only one-half of the distribution is
tabulated.

B. Fuzzy Ranks

Let R = {r_1, r_2, ..., r_n} be a set of elements and Q be a subset of R. Then we can
define the characteristic function μ_Q: R → {0, 1} such that

    \mu_Q(r_i) = \begin{cases} 1 & \text{if } r_i \in Q \\ 0 & \text{if } r_i \notin Q \end{cases}        (3)



If, however, R is the set of men and Q is taken to be the set of old men, there may
be some vagueness about the membership of certain r_i in Q. Is a 50-year-old man a
member of Q? I used to think so; but now that I'm older, I'm not quite so sure.
Suppose we let μ_Q take on values other than 0 and 1; in particular, any value between 0
and 1 so that μ_Q: R → [0, 1].

In this case Q is called a fuzzy subset of R and μ_Q is called the membership
function of Q. Each r_i has associated with it a value μ_Q(r_i) representing a degree of
membership in Q. The closer this value is to one, the more completely the associated r_i
is a member of Q. Numerical data can be represented by equating R with the set of
real numbers, in which case Q is called a fuzzy number.

In this application we will examine fuzzy numbers and, in particular, fuzzy integers
since we are concerned with ranks. A fuzzy number will be represented by a
membership function quantifying the degree to which it takes on any specific value.
Figure 1 shows a membership function μ for "fuzzy six". This function assumes its
maximum value at six, μ(6) = 1; the closer any number is to six, the higher its degree
of membership in "fuzzy six". When we examine fuzzy ranks, the membership functions
will be discrete, since our interest will be only in the degree of membership for integer
values.

This membership function is not unique; rather, it is subjective - determined by the
user and based on his perception of the vagueness of the data. To make full use of
this methodology we need the Extension Principle³, which permits definition of a
mathematical operation f on two fuzzy numbers. It states that if X is a fuzzy number with
membership function μ_X(x) and Y is a fuzzy number with membership function μ_Y(y),
then Z = f(X,Y) is a fuzzy number with membership function

    \mu_Z(z) = \max_{x,\, y \,:\, f(x,y) = z} \min [ \mu_X(x), \mu_Y(y) ] .        (4)

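As a concrete illustration of Equation 4 for discrete fuzzy numbers, the following sketch represents each fuzzy integer as a dictionary from values to membership degrees. The function name extend and the two fuzzy integers are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from itertools import product


def extend(f, mu_x, mu_y):
    """Membership function of Z = f(X, Y) by the max-min rule of Equation 4.

    mu_x and mu_y map values to membership degrees in [0, 1].
    """
    mu_z = {}
    for (x, mx), (y, my) in product(mu_x.items(), mu_y.items()):
        z = f(x, y)
        # Keep the largest of the pairwise minimums that produce this z.
        mu_z[z] = max(mu_z.get(z, 0.0), min(mx, my))
    return mu_z


# Hypothetical fuzzy integers, for illustration only.
fuzzy_two = {1: 0.5, 2: 1.0, 3: 0.5}
fuzzy_four = {3: 0.3, 4: 1.0, 5: 0.3}

fuzzy_sum = extend(lambda x, y: x + y, fuzzy_two, fuzzy_four)
print(fuzzy_sum)  # {4: 0.3, 5: 0.5, 6: 1.0, 7: 0.5, 8: 0.3}
```

The sum peaks at 6 with membership 1, and the memberships fall away on either side, so the result is again a fuzzy integer.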


³ Zadeh, L.A., "The Concept of a Linguistic Variable and its Application to Approximate Reasoning I, II, III,"
Information Sciences, Vols. 8, 9, 1975.

Figure 1. Membership Function of Fuzzy Six.


Figure 2 shows some membership functions established for the absolute value of
three of the members of the original data set (-0.181, 0.186, 0.200). Recall that the set S
contained ordered pairs of the form (I_X, X) where X was a number from the original data
set and I_X was the rank associated with the absolute value of X. The shapes of these
membership functions are symmetric and triangular with a spread equal to ten percent
of the largest value in the data set (remember that these are modeling decisions).
Hence, the membership value of "fuzzy 0.181" is non-zero from 0.081 to 0.281 and has
its zenith at 0.181.

We can define a membership function for the first member of each ordered pair -
the rank denoted by I_X - as follows:

    \mu_{I_X}(I_Y) = \max_z \min [ \mu_X(z), \mu_Y(z) ] .        (5)

Figure 2. Membership Functions of a Portion of the Original Data Set.


This equation provides the membership value for I_Y in "fuzzy rank I_X". Thus, in
Figure 2, the top horizontal line intersects the ordinate at a point equal to μ_3(4), the
middle horizontal line intersects the ordinate at a point equal to μ_4(5), and the bottom
horizontal line intersects the ordinate at a point equal to μ_3(5). This definition of the
membership function for the fuzzy ranks produces the following properties:

    \mu_{I_X}(I_X) = 1 ,        (6)

    \mu_{I_X}(I_Y) = 0 if μ_X(x) and μ_Y(y) do not intersect, and        (7)

    \mu_{I_X}(I_Y) = \mu_{I_Y}(I_X) .        (8)

Figure 3 shows the membership functions for the entire set of original data. The
ordinate values of their points of intersection are listed in Table 2. These, of course, are
the values of μ_{I_X}(I_Y) shown in Equation 5 and define the membership functions of the
fuzzy ranks of the data, such functions being discrete since the ranks can take on only
integer values. Note that the table is symmetric, a result of Equation 8.

 Data
Points     1     2     3     4     5     6     7     8     9    10

   1     1.00  0.93  0.36  0.33  0.26  0.00  0.00  0.00  0.00  0.00
   2     0.93  1.00  0.43  0.41  0.34  0.00  0.00  0.00  0.00  0.00
   3     0.36  0.43  1.00  0.98  0.91  0.00  0.00  0.00  0.00  0.00
   4     0.33  0.41  0.98  1.00  0.93  0.00  0.00  0.00  0.00  0.00
   5     0.26  0.34  0.91  0.93  1.00  0.00  0.00  0.00  0.00  0.00
   6     0.00  0.00  0.00  0.00  0.00  1.00  0.75  0.00  0.00  0.00
   7     0.00  0.00  0.00  0.00  0.00  0.75  1.00  0.22  0.00  0.00
   8     0.00  0.00  0.00  0.00  0.00  0.00  0.22  1.00  0.00  0.00
   9     0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  1.00  0.44
  10     0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.44  1.00

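The entries of Table 2 can be reproduced with a short sketch under the modeling decisions stated in the text: symmetric triangular membership functions whose half-width is ten percent of the largest absolute value in the data set. For two such triangles, the max-min of Equation 5 is the height at which they cross, which depends only on the distance between their centers. The names below are hypothetical, and the data are read as in set S (0.186 and -0.623).

```python
data = [-0.888, 0.200, -1.000, -0.417, -0.052,
        0.186, 0.067, -0.467, -0.623, -0.181]

# Half-width of each triangle: ten percent of the largest absolute value.
SPREAD = 0.1 * max(abs(d) for d in data)


def triangular(x, center, spread=SPREAD):
    """Symmetric triangular membership function with its peak at center."""
    return max(0.0, 1.0 - abs(x - center) / spread)


def rank_membership(a, b, spread=SPREAD):
    """Equation 5 for two equal-width triangles centered at a and b.

    The triangles cross midway between the centers, at height
    1 - |a - b| / (2 * spread), or not at all if they are too far apart.
    """
    return max(0.0, 1.0 - abs(a - b) / (2.0 * spread))


values = sorted(abs(d) for d in data)  # values[i] has crisp rank i + 1
table = [[rank_membership(a, b) for b in values] for a in values]

for rank, row in enumerate(table, start=1):
    print(f"{rank:3d}  " + "  ".join(f"{m:.2f}" for m in row))
```

For example, the entry for ranks 9 and 10 is 1 - (1.000 - 0.888)/0.2 = 0.44, and the entry for ranks 8 and 9 is zero because 0.888 - 0.623 exceeds twice the spread.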


Once the membership functions of the ranks are established, it is necessary to
calculate the value of T, the sum of the positive ranks. T will be the sum of fuzzy
integers and, as such, will be a fuzzy integer itself. To determine its membership
function, we refer to the Extension Principle and determine that

    \mu_T(t) = \max_{(I_{Y_1}, I_{Y_2}, \ldots, I_{Y_{10}})} \min [ \mu_1(I_{Y_1}), \ldots, \mu_{10}(I_{Y_{10}}) ] ,
    \qquad t = \sum_{i=1,\; Y_i > 0}^{10} I_{Y_i} ,        (9)

where (I_{Y_1}, I_{Y_2}, ..., I_{Y_10}) are permutations of the integers I_{Y_1}, I_{Y_2}, ..., I_{Y_10}.

In this case of ten data points, T can take on all integer values between 0 and 55;
each of these possible sums will have a membership value associated with it. To obtain
μ_T(t), we refer to Table 2 and perform the following steps:

1.  Select  a  permutation  of  the  ranks. 

2.  From  Table  2  determine  the  minimum  membership  value  of  the  ranks  in 
their  respective  positions  for  this  particular  permutation. 

3.  If  that  minimum  membership  value  is  greater  than  zero,  determine  the  sum 
of  the  positions  of  the  positive  ranks  for  this  particular  permutation. 

4.  If  the  membership  value  is  greater  than  the  membership  value  currently 
associated  with  that  sum,  replace  with  the  new  membership  value. 

We  continue  with  this  sequence  of  operations  until  all  the  permutations  have  been 
exhausted,  at  which  time  we  have  associated  with  every  possible  value  of  T  a 
membership  value  which  is  the  maximum  over  all  permutations  of  the  minimums  for 
each  individual  permutation. 

Using our set of ordered pairs, S, we can provide an example of the sequence above:

1. Suppose our selected permutation is 5 1 3 2 4 7 6 8 10 9.

2. Referring to Table 2, we can see that the membership value of rank 5 in the
first position is 0.26, the membership value of rank 1 in the second position
is 0.93, the membership value of rank 3 in the third position is 1.00, and so
forth. If any one of these is equal to zero, then the minimum is equal to
zero, and we skip steps three and four. For this particular permutation, the
minimum membership value is 0.26.

3. The sum of the positions of the positive ranks for this particular
permutation is equal to ten (first plus fourth plus fifth).

4. If 0.26 is greater than the current membership value associated with a sum
of ten, then replace it.

When we have examined all possible permutations, the membership function
associated with the sum of positive ranks, T, is as shown in Table 3. Membership values
associated with T ≤ 5 and T ≥ 13 are all equal to zero.


TABLE 3. Membership Function Associated with the Sum of Positive Ranks for
the Original Data Set (Non-zero Values).

    T     μ(T)

    6    0.330
    7    0.355
    8    0.905
    9    0.925
   10    0.975
   11    1.000
   12    0.430


Of course, examining all permutations can be very time consuming. This particular
case required 201 seconds of central processor unit (CPU) time on a CDC 7600
computer. However, because of the large number of membership values that were equal
to zero (see Table 2), many of the permutations could be ignored, since resulting
minimums would be equal to zero and would not affect subsequent maximums. By
taking advantage of this information to modify the permutation subroutine, I was able
to reduce the CPU requirement to 43 seconds. Even with this kind of reduction, it is
difficult to exceed a sample size of twelve without incorporating other shortcuts. One
very effective method is to segment the data set, particularly if there is a datum point
which is crisp rather than fuzzy; that is, its membership value at all but one position is
equal to zero. Using this characteristic, I was able to handle a sample size of 32 in a
later application of this work.
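The max-min search over permutations, including the shortcut of skipping any placement with zero membership, can be sketched as a recursive search. This is an illustration under stated assumptions rather than the original subroutine: the function name fuzzy_T is hypothetical, the memberships are computed from triangular functions with a half-width of ten percent of the largest absolute value, and the data are read as in set S (0.186 and -0.623).

```python
data = [-0.888, 0.200, -1.000, -0.417, -0.052,
        0.186, 0.067, -0.467, -0.623, -0.181]

SPREAD = 0.1 * max(abs(d) for d in data)
ordered = sorted(data, key=abs)  # ordered[i] has crisp rank i + 1

# mu[i][j]: membership of the datum with crisp rank i + 1 in position j + 1.
mu = [[max(0.0, 1.0 - abs(abs(a) - abs(b)) / (2.0 * SPREAD)) for b in ordered]
      for a in ordered]
positive = [d > 0 for d in ordered]


def fuzzy_T(mu, positive):
    """Membership function of T: for each attainable sum, the maximum over
    permutations of the minimum membership along the permutation."""
    n = len(mu)
    best = {}
    used = [False] * n

    def place(position, current_min, rank_sum):
        if position == n:
            if current_min > best.get(rank_sum, 0.0):
                best[rank_sum] = current_min
            return
        for datum in range(n):
            m = mu[datum][position]
            if used[datum] or m == 0.0:
                continue  # a zero minimum can never raise a maximum
            used[datum] = True
            step = position + 1 if positive[datum] else 0
            place(position + 1, min(current_min, m), rank_sum + step)
            used[datum] = False

    place(0, 1.0, 0)
    return dict(sorted(best.items()))


for t, m in fuzzy_T(mu, positive).items():
    print(t, f"{m:.3f}")
```

With the zero entries pruned, only 720 of the 3,628,800 permutations reach the final position for this data set, so the search finishes almost immediately on current hardware.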


III. INTERPRETING RESULTS

When the data were considered non-fuzzy, we saw that there was insufficient
evidence for rejecting the hypothesis of equal medians. We could have provided a
critical level as defined by Conover; in doing so, we would have concluded that the null
hypothesis could have been rejected at a significance level of approximately 0.105
(see Table 1 and recall that we are performing a two-sided test).


Treating the data as fuzzy numbers provides a fuzzy result for T with a
membership function described in Table 3. This allows for several methods of
interpretation. Observing that μ(T) = 1 (its maximum) when T=11, we might state
that there is insufficient evidence for rejecting the null hypothesis at the α = .10 level.
Thus, the classical (non-fuzzy) signed-ranks test emerges as a special case.
Alternatively, knowing that T=10 was the threshold for rejection, we might state that
the null hypothesis can be rejected at the α = .10 level with a membership value of
0.975. Since we recognize the data as imprecise, perhaps the best alternative is to accept
the imprecision inherent in the resulting test statistic and make the decision as to
whether or not to reject the null hypothesis based on the entire membership function.
In our example, the membership value exceeds 0.900 for T=8 through T=11.
Therefore, none of these values should be disregarded when analyzing the data; they all
became viable candidates for T when the model took into account the proximity of the
data points. The nature of any particular application should assist in making the final
decision less subjective. Our example represents a situation in which the null hypothesis
of equal medians would not have been rejected based on the original data set but may
be rejected when the data, imprecise in nature, are treated as fuzzy numbers.

V.  SUMMARY 

Hypothesis testing is an important and useful tool for data analysis. When the data
are vague or imprecise, an additional source of error is introduced and may result in an
incorrect decision whether or not to reject the null hypothesis. Treating the data as
fuzzy numbers allows us to model the uncertainty; and manipulating the data using
fuzzy arithmetic allows us to carry the uncertainty through to the final results, at which
point a more informed decision can be made.

Rank tests are a class of hypothesis tests which are especially susceptible to the
problems of imprecise data since the data, once ranked, have no further influence
regardless of how closely they might be grouped. The Wilcoxon signed-ranks test is one
example; and it was this particular hypothesis test that was applied to some data
assumed to be vague in nature. The data were represented as fuzzy numbers, and the
test statistic was calculated using fuzzy arithmetic. This provided a final result which
was itself a fuzzy number, and several methods of interpreting this result were
discussed.

I found computer time to be a major problem with incorporating fuzzy data into
rank tests. In this case I needed to examine all possible permutations of rankings for all
the data. For 10 data points the problem is not too bad; but if the data set is expanded
to 30 points, then even with newer, faster computers some special techniques must be
applied. In most cases one should be able to segment the data set, so that groups of ten
or fewer can be examined and the results combined. This should make fuzzy hypothesis
testing feasible as well as reasonable - an even more important and more useful tool for
the statistician!
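
The growth in the number of rankings can be made concrete with a short sketch. It assumes, hypothetically, that a segmented data set splits into consecutive groups whose rankings can be enumerated separately; the function name and the grouping rule are illustrative and are not taken from the original program.

```python
from math import factorial


def rankings_to_examine(n_points, group_size=None):
    """Count the permutations of rankings that must be enumerated.

    With group_size=None the whole data set is permuted at once.
    Otherwise the points are split into consecutive groups of at most
    group_size points and each group is enumerated on its own
    (an assumption made for this illustration).
    """
    if group_size is None:
        return factorial(n_points)
    full_groups, remainder = divmod(n_points, group_size)
    total = full_groups * factorial(group_size)
    if remainder:
        total += factorial(remainder)
    return total


whole_10 = rankings_to_examine(10)                     # 10!
whole_30 = rankings_to_examine(30)                     # 30!
segmented_30 = rankings_to_examine(30, group_size=10)  # three groups of 10

print(whole_10, segmented_30, whole_30)
```

Under this assumption, thirty points in three groups of ten cost only three times the ten-point problem, whereas the unsegmented thirty-point problem grows as 30!.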
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Abstract 

Fuzzy random variables have been proposed to treat situations in which both random
behavior and fuzzy perception must be considered. A definition of independence is
given for fuzzy random variables, as well as a notion of fuzzy Gaussian random
variables. It is shown that a sum or mean of independent fuzzy random variables converges
in the limit to a fuzzy Gaussian random variable, thus providing a fuzzy analogue of the
central limit theorem of classical probability theory.
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Abstract 

Fuzzy sets are useful as a modeling tool in situations which have an ingredient of
uncertainty or vagueness, as distinct from randomness. One class of problems fitting
this description arises in vulnerability analysis. An application of a fuzzy random
variable to enhance a vulnerability model currently in use is discussed.




1.  Introduction 


Kwakernaak, in a seminal paper [6], introduced the notion of a fuzzy random
variable as a random variable whose values are not real but fuzzy numbers. Expectation
and probabilities relating to a fuzzy random variable are developed as images of a fuzzy
set, representing the fuzzy random variable, under appropriate mappings. A natural
development of the theory is to examine fuzzy analogues of classical probability laws.

Toward this end, Kruse [5] and Miyakoshi and Shimbo [8] report on a strong law of
large numbers. Stein and Talati [13], following Nahmias [9], develop a theory
specifically for convex fuzzy random variables. Boswell and Taylor [2] provide a fuzzy
analogue of the central limit theorem for fuzzy random variables admitting a moment
generating function extension. Puri and Ralescu [11] outline a theory similar to
Kwakernaak's and derive a dominated convergence theorem.

Application of these potentially powerful concepts has yet to evolve. Schlegel,
Shear and Taylor [12] cite areas of vagueness in vulnerability modeling and suggest
fuzzy sets as a potential modeling tool. The implementation of one such suggestion
using a fuzzy random variable is the topic of this paper.

2.  Fuzzy  Random  Variables 

Kwakernaak [6] defines a fuzzy set f as a triple $f = (A, t, p)$ consisting of a basic
set A, a logical proposition p which can be applied to every member of the basic set,
and a function t which assigns to every member $x \in A$ a truth value $t(p(x))$ indicating
the appropriateness of the proposition p as applied to x. Most authors suppress the
proposition p notation, since it is implicit in the organizing principle of the fuzzy set, and
compose the proposition and truth value into a membership function $\mu: A \to [0, 1]$ which
acts on the basic set, $\mu(x) = t(p(x))$. Thus f would be written $f = (A, \mu)$; we shall
adopt this convention.

An $\alpha$-level set corresponding to a given fuzzy set $f = (A, \mu)$ is an ordinary
non-fuzzy set, denoted

$L_\alpha(f) = \{x \in A \mid \mu(x) \ge \alpha\}$.  (2.1)

A fuzzy number is a fuzzy set having the real line R as its basic set. The fuzzy number
f, or its membership function $\mu$, is said to be unimodal if for every $\alpha \in (0, 1]$, $L_\alpha(f)$ is
convex. We shall be concerned with a collection C of fuzzy numbers defined as follows:
a fuzzy number $f = (R, \mu)$ belongs to C if its membership function $\mu$ satisfies

(i) $\mu$ is upper semicontinuous,

(ii) for some $x \in R$, $\mu(x) = 1$,

(iii) for all $\alpha > 0$, $L_\alpha(f)$ is bounded.

The set of membership functions satisfying (i) - (iii) will be called S.
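
As an illustration of these definitions, the sketch below builds a triangular membership function and its level sets. The triangular shape and the names `triangular` and `level_set` are assumptions chosen for the example; a triangular function with left < peak < right is continuous, attains the value 1 at its peak, and has bounded level sets, so it satisfies conditions (i) through (iii).

```python
def triangular(left, peak, right):
    """Return the membership function of a triangular fuzzy number.

    The shape is a hypothetical choice for illustration; it requires
    left < peak < right.
    """
    def mu(x):
        if x == peak:
            return 1.0
        if left < x < peak:
            return (x - left) / (peak - left)
        if peak < x < right:
            return (right - x) / (right - peak)
        return 0.0
    return mu


def level_set(left, peak, right, alpha):
    """Return the closed interval {x : mu(x) >= alpha} for 0 < alpha <= 1."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    lower = left + alpha * (peak - left)
    upper = right - alpha * (right - peak)
    return lower, upper


mu = triangular(0.0, 2.0, 6.0)
lower, upper = level_set(0.0, 2.0, 6.0, 0.5)
print(lower, upper, mu(lower), mu(upper))
```

Each level set is an interval, hence convex, so this membership function is also unimodal in the sense defined above.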
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Fuzzy  random  variables  are  constructed  as  a  means  of  modeling  phenomena  which 
could  properly  be  described  by  ordinary  real  random  variables  defined  on  a  probability 
space $(\Omega, F, P)$, but which are partially obscured by fuzzy perception of the real line. In
particular, if $U_0$ is the underlying random variable and $\omega$ is the outcome of a random
experiment, the exact value $U_0(\omega)$ is unobservable; instead, it is assumed that a fuzzy
number $f = (R, X_\omega)$ is known which characterizes the result $U_0(\omega)$. The mapping
$X: \Omega \to S$ given by $X(\omega) = X_\omega$ supplies a membership function for each random
outcome, and is called a fuzzy perception function. To the observer who must perceive
random outcomes via X, the identity of $U_0$ is lost, and in fact there may be many
reconstructions of $U_0$ which are amenable to fuzzy perception. By the standard operations of
fuzzy logic [4], X generates a valuation function which applies to random variables as
entities. Namely, if U is an F-measurable random variable, then

$\mu_X(U) = \inf_{\omega \in \Omega} X_\omega(U(\omega))$  (2.2)

is the valuation of its suitability as a reconstruction of $U_0$.

Kwakernaak's development of the basic set of random variables to serve as
candidates for reconstruction is rather involved. In [2] we make some simplifying
assumptions which are sufficient for our application. Briefly, we admit as a basic set $U_F$, the
set of all F-measurable random variables on $\Omega$, and enforce partial retention of the
structure of $(\Omega, F, P)$ through the requirement that for all $\alpha \in (0, 1]$ the functions

$U_\alpha^{*}(\omega) = \inf \{x \in R \mid X_\omega(x) \ge \alpha\}$

and  (2.3)

$U_\alpha^{**}(\omega) = \sup \{x \in R \mid X_\omega(x) \ge \alpha\}$

are measurable with respect to $(\Omega, F)$. The sigma algebra generated by the random
variables $U_\alpha^{*}$, $\alpha \in (0, 1]$, and $U_\alpha^{**}$, $\alpha \in (0, 1]$, is denoted by $\sigma(X)$, and $\chi$ denotes the set of all
$\sigma(X)$-measurable random variables on $\Omega$.

Letting $U_F$ be the collection of all F-measurable random variables on $\Omega$, the fuzzy
random variable induced by X is defined as

$X = (U_F, \mu_X)$.

Some properties of a fuzzy random variable may be obtained directly by the
extension principle [14]. For example, the expectation of a fuzzy random variable X is a
fuzzy number

$EX = (R, \mu_{EX})$

with membership function

$\mu_{EX}(x) = \sup_{U \in U_F:\, EU = x} \; \inf_{\omega \in \Omega} X_\omega(U(\omega)) = \sup_{U \in U_F:\, EU = x} \mu_X(U)$,  $x \in R$.  (2.4)

In (2.4), E denotes the usual mathematical expectation.

A fuzzy random variable X is called unimodal if for each $\omega \in \Omega$, the membership
function $X_\omega$ is unimodal. Kwakernaak shows that if X is unimodal the basic set $U_F$
may be restricted to the set of all $\sigma(X)$-measurable random variables on $\Omega$.

Theorem (2.1). If X is unimodal, then

$\mu_{EX}(x) = \sup_{U \in \chi:\, EU = x} \; \inf_{\omega \in \Omega} X_\omega(U(\omega))$,  $x \in R$.  (2.5)

3.  Vulnerability  Modeling

This is an account of an application of a fuzzy random variable to an important
problem in vulnerability modeling. Succinctly, vulnerability modeling is an attempt to
characterize the interaction between a target (armored vehicle, aircraft, bunker, ...) and
a munition (kinetic energy penetrator, shaped charge, explosive device, ...) and to assess
quantitatively the resulting damage sustained (inflicted) within the target-munition
combination.

Experimental  testing  required  to  provide  data  pertinent  to  vulnerability  modeling  is 
destructive,  and  the  data  base  upon  which  these  models  are  built  may  be  modest,  or  in 
the  case  of  conceptual  systems,  nonexistent.  Furthermore,  while  certain  damage-related 
measurements  (velocity  of  impact,  depth  of  penetration,  component  function)  may  be 
determined  in  an  unambiguous  manner,  many  others  (structural  deformation,  fracturing, 
component degradation-of-function) may not. The composition of quantitative
measurements and qualitative information into a cohesive assessment of damage remains at the
core  of  difficulty  in  vulnerability  modeling.  We  will  consider  a  particular  vulnerability 
model  [10]  currently  in  use  and  demonstrate  the  applicability  of  fuzzy  sets  to  its 
enhancement. 

Figure 1 represents a data summary of an encounter between an armored vehicle
and a kinetic energy penetrator. A rectangular grid of 10x10 cm cells has been
superimposed on the profile of the armored vehicle, and within each cell of the grid, the
probability that the vehicle will be rendered inoperable (killed), should it sustain an impact
within that cell, is listed. Within the bold rectangle, for example, the probability-of-kill,
given a hit in cell i, $P_{k|h}$, is estimated to be .19.


[Figure: membership function for fuzzy $P_{k|h}$]


Since the membership functions are unimodal, the expectation of X is a fuzzy
number EX with membership function

$\mu_{EX}(x) = \sup_{U \in \chi:\, EU = x} \; \inf_{\omega \in \Omega} X_\omega(U(\omega))$,  $x \in R$.  (3.1)

This expression can be evaluated using the $\alpha$-level sets (2.1). Given the family of level
sets $L_\alpha(\cdot)$, the membership function $\mu_{EX}$ may be recovered with the aid of the formula

$\mu(x) = \sup \{\alpha \in [0, 1] \mid x \in L_\alpha(\cdot)\}$,  $x \in R$.  (3.2)

For the simple membership function of Figure 2, this computation can be simplified
using procedures detailed by Dubois and Prade [3] or Donnisone [1]. Applying these
procedures to the data in Figure 1 we obtain for EX the membership function shown in
Figure 3.
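
The level-set computation can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually applied to Figure 1: it assumes triangular cell membership functions, crisp hit probabilities per cell, and that each $\alpha$-level set of EX is the interval whose endpoints are the probability-weighted averages of the cell level-set endpoints. The cell values and function names are hypothetical.

```python
def cut(tri, alpha):
    """Level set of a triangular fuzzy number (left, peak, right) at alpha."""
    left, peak, right = tri
    return left + alpha * (peak - left), right - alpha * (right - peak)


def expectation_cut(cells, alpha):
    """Level set of EX, assumed to be the weighted average of cell level sets.

    cells is a list of (hit_probability, (left, peak, right)) pairs whose
    probabilities sum to one.
    """
    lower = sum(p * cut(tri, alpha)[0] for p, tri in cells)
    upper = sum(p * cut(tri, alpha)[1] for p, tri in cells)
    return lower, upper


def membership_from_cuts(cells, x, steps=1000):
    """Approximate (3.2): the largest alpha on a grid whose level set contains x."""
    best = 0.0
    for k in range(1, steps + 1):
        alpha = k / steps
        lower, upper = expectation_cut(cells, alpha)
        if lower <= x <= upper:
            best = alpha  # level sets are nested, so the last hit is the sup
    return best


# Two hypothetical cells, each hit with probability 0.5.
cells = [(0.5, (0.0, 0.2, 0.4)), (0.5, (0.2, 0.4, 0.6))]
print(expectation_cut(cells, 0.5))
print(membership_from_cuts(cells, 0.2), membership_from_cuts(cells, 0.3))
```

Because the grid over $\alpha$ is finite, the recovered membership values are accurate only to about 1/steps.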


Fig. 3  Membership function for expectation EX of fuzzy random variable X
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4.  Conclusion 

The form chosen for the cell $P_{k|h}$ membership functions (symmetric, normal,
convex) leads to a membership function for EX that is similar to the constituent $\mu(x)$ in
Figure 2. The interpretation of the resultant $\mu_{EX}$ is that estimates of overall $P_k$ in the
interval (.18, .23) are, within our framework of uncertainty, wholly plausible. The
impulse to take the level set $L_{.90}(\cdot)$, say, and consider it a 90% confidence interval for
$P_k$ must be resisted; there are no probability statements carried by the $\alpha$-level sets. We
have, however, modeled the uncertainty of the cell $P_{k|h}$ estimates in the overall
probability-of-kill estimate $P_k$ in a direct way, and distinguished between randomness
and uncertainty in the vulnerability model. We also have the framework in place to
consider $P_{k|h}$ membership functions far more intricate than the one shown in Figure 2.
This is a significant methodological improvement.
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PROBLEMS  ENCOUNTERED  IN  FITTING  A  LARGE  NUMBER  OF  SHORT  TIME  SERIES 
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ABSTRACT. The US Army Concepts Analysis Agency has become increasingly
involved in various forecasting projects. Most of the projects have
some common characteristics. Typically each series has less than 100
observations and often less than 50 observations. Box-Jenkins (2) suggests
that more than 100 observations are preferable and that one uses experience
and past information to yield preliminary models where fewer than 50
observations are available. Usually the Agency analysts are not
extremely familiar with the systems and processes which generate these
series. Each project commonly has a set of series which consists of
many elements. The number of elements in a set can range from 600 to
1,000 individual time series. The identified model form of each
individual time series in the set varies greatly. Often only white
noise is present. On the other hand some of the series will exhibit
seasonal behavior. Many of the series are nonstationary and have
potential interventions. Many of the series take on only a discrete set
of values such as the set of integers from zero to ten.

1. INTRODUCTION. One project requires a forecast of the quantities of
various commodities shipped over various routes. The forecast of
potential loads would be helpful in scheduling limited transportation
facilities. This project involved about 400 individual series, each one
describing the history of a particular commodity; for example, coal over
a particular route, say port of New York to Europe. Another project
involved forecasting the number of separations from the US Army of
enlisted grades E-5 and E-6 for about 300 different military
occupational skills. These series are often influenced by policy
changes. Table 1 illustrates the types of forecasting projects and
their requirements for two recent projects.

Table 2 compares the results of several different forecasting techniques
giving a "best" forecast, as described further in this paper, for a
selected sampling of time series from these two projects.


Table 1.  Series Characteristics

                                             Project
Characteristic                            1         2

Length of series                         84        38
Length of fit                            72        34, 35, 36, 37
Forecast horizon                         12         1
Total number of series to evaluate      400       579

Table 2.  Comparison of Forecasts

Forecast               Project

Box-Jenkins
Winters/Gardner
Ties
Unknown
Total


2. PROBLEMS. Many problems arise in the analysis of time series.
However, the literature is limited on methods to handle short series.
The Agency is often confronted with both a methodological and a
procedural problem. The methodological problem is largely a result of
the inherent instability of model form and values of estimated
parameters for short time series. The procedural problem is usually
imposed by study sponsors who require a process which will act in a
production mode by incorporating new observations into the forecast as
they become available.

Since no particular methodologies are suggested as superior for
short series we have adopted an all-inclusive policy. The foremost
technique for evaluating long time series is the Box-Jenkins process
(Table 3).


Table 3.  Box-Jenkins Process


IDENTIFICATION    ESTIMATION & TESTING    FORECASTING


The Box-Jenkins process is a three-stage process consisting of two
iterative stages, (1) identification, followed by (2) estimation and
testing, and finally (3) a forecasting stage. The task of identifying
400 individual series by evaluating the sample autocorrelation and
sample partial autocorrelation functions can be monumental. This is
especially true when not only the original series must be
examined, but also several other series, owing to nonstationarity and the
simultaneous consideration of seasonal as well as nonseasonal model
forms. Automated identification is an essential consideration in these
Agency projects. Therefore we have employed two pieces of software
which render automatic identification. These packages and their
attributes are described in Table 4.


Table 4.  Automatic Identification Routines for Box-Jenkins

Methods Employed

(1) AUTOBOX

AUTOMATIC IDENTIFICATION
INTERVENTION DETECTION (UP TO 5)
BOX-COX TRANSFORMATIONS (SQUARE ROOT, NATURAL LOG,
RECIPROCAL SQUARE ROOT, AND RECIPROCAL)

(2) ARIMAID

AUTOMATIC IDENTIFICATION
USES AKAIKE'S INFORMATION CRITERION
MANUAL TRANSFORMATIONS POSSIBLE


In addition to the Box-Jenkins technique several techniques of modeling
an exponentially weighted moving average have been employed. Two such
techniques are the Holt/Winters^®) Model and the Gardner-McKenzie(3)(4)
Model. The update forms of these models are shown in Table 5.


Table 5.  Multiplicative Seasonal Models Update Sequences


WINTERS - HOLT

$S_t = S_{t-1} + T_{t-1} + \alpha e_t / I_{t-p}$

$T_t = T_{t-1} + \alpha\gamma e_t / I_{t-p}$

$I_t = I_{t-p} + \delta(1-\alpha) e_t / S_t$

$\hat{X}_t(1) = (S_t + T_t) I_{t-p+1}$

WHERE

$X_t$ = OBSERVED VALUE AT TIME t
$e_t$ = FORECAST ERROR AT TIME t
$S_t$ = LEVEL (MEAN) AT TIME t
$I_t$ = SEASONAL INDEX AT TIME t
$\gamma$ = TREND SMOOTHING PARAMETER
$\phi$ = TREND MODIFICATION PARAMETER
p = NUMBER OF PERIODS IN A CYCLE


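GARDNER NONLINEAR

$e_t = X_t - \hat{X}_{t-1}(1)$

$S_t = S_{t-1} + \phi T_{t-1} + \alpha(2-\alpha) e_t / I_{t-p}$

$T_t = \phi T_{t-1} + \alpha(\alpha-\phi+1) e_t / I_{t-p}$

$\hat{X}_t(1) = (S_t + T_t) I_{t-p+1}$

$\hat{X}_t(1)$ = ONE-STEP AHEAD FORECAST AT TIME t
$T_t$ = TREND AT TIME t
$\alpha$ = LEVEL SMOOTHING PARAMETER
$\delta$ = SEASONAL INDEX SMOOTHING PARAMETER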


The local implementation of the Gardner-McKenzie technique is described
in Table 6.


Table 6.  Gardner Model Procedure

1. FIT LINEAR REGRESSION TO RAW OR TRANSFORMED DATA TO ESTABLISH BEGINNING
SLOPE AND INTERCEPT, $S_0$ AND $T_0$.

2. FOR NONSEASONAL MODEL, ESTIMATE $\alpha$, $\gamma$, AND $\phi$ BY A GRID-SEARCH METHOD TO
MINIMIZE MEAN SQUARE ERROR.

3. FOR SEASONAL MODEL, ESTIMATE $\delta$ BY HOLDING $\alpha$, $\gamma$, AND $\phi$ FIXED AND DOING
A GRID SEARCH FOR $\delta$ TO MINIMIZE MEAN SQUARE ERROR. CHOOSE INITIAL
SEASONAL INDICES

Ij.pTO  I0  SUCH  THAT  1  j _e-P* X.  j/?^.  j 

WHERE  X,j  -  OBSERVATION  I  IN  PERIOD  J 

Hj  •  NUMBER  OF  OBSERVATIONS  IN  PERIOD  J 
P  •  NUMBER  OF  PERIODS 


The estimation process for the exponentially weighted techniques
requires the development of parameter values which are traditionally
chosen in an arbitrary, ad hoc fashion.

In order to develop a reiterative process and also to some extent to
ameliorate the stability problems of short series, a sequential
technique was employed. This process is described in Table 7.


Table 7.  Algorithm Followed in Project #2

(1) RESERVE SOME N* OF THE LAST OBSERVATIONS

(2) ITERATE N* TIMES CALCULATION OF PARAMETERS FOR THE SEVERAL MODEL TYPES

(3) USE N-N* OBSERVATIONS ON FIRST ITERATION

(4) CALCULATE ONE-STEP AHEAD FORECAST AT FIRST ITERATION

(5) ON SECOND AND SUCCESSIVE ITERATIONS ADD AN ADDITIONAL OBSERVATION UNTIL
N-1 OBSERVATIONS ARE INCLUDED, CALCULATE PARAMETERS FOR THE SEVERAL
MODEL TYPES

(6) CALCULATE ONE-STEP AHEAD FORECAST AT EACH ITERATION

(7) CALCULATE ROOT MEAN SQUARE ERROR OVER ITERATED ONE-STEP AHEAD FORECASTS
FOR EACH MODEL TYPE

(8) FIT ALL N OBSERVATIONS BY METHOD YIELDING MINIMUM ROOT MEAN SQUARE ERROR
ON THE N* RESERVED OBSERVATIONS


At each new time point all techniques are simultaneously applied to the
time series. The forecasts over a certain period of the recent past, in our
case the last four points, are evaluated to determine a "best" technique. In one
project which involved 38 quarterly observations, the last 4
observations formed this recent past and are referred to as the reserve
set. In four successive steps, the first 34, 35, 36, and 37 observations in
each series were modeled by each of the several techniques. At each
step, for each of the techniques, the one-step ahead forecast error was
calculated by subtracting the technique's forecast from the true
observation. A very rough selection of the best technique to
forecast the 39th point in this case could be made by selecting that
technique which yielded the minimum mean square error on the reserve set
(i.e., times 35, 36, 37, and 38).
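
The reserve-set selection can be sketched as below. This is an illustration only: the three candidate forecasters (last value, overall mean, and simple exponential smoothing with an arbitrary smoothing constant) are hypothetical stand-ins for the Box-Jenkins, Winters/Holt, and Gardner models actually used, and the function names are assumptions.

```python
from math import sqrt


def naive(history):
    """Forecast the next point as the last observed value."""
    return history[-1]


def mean_forecast(history):
    """Forecast the next point as the mean of all observed values."""
    return sum(history) / len(history)


def ses(history, alpha=0.3):
    """Simple exponential smoothing; alpha is an arbitrary illustrative value."""
    level = history[0]
    for x in history[1:]:
        level += alpha * (x - level)
    return level


def select_best(series, n_reserve, methods):
    """Pick the method with minimum RMSE of one-step forecasts on the reserve set.

    For each of the last n_reserve points, every method is fitted to all
    earlier observations and asked for a one-step ahead forecast.
    """
    n = len(series)
    rmse = {}
    for name, forecast in methods.items():
        squared = []
        for k in range(n - n_reserve, n):
            error = series[k] - forecast(series[:k])
            squared.append(error * error)
        rmse[name] = sqrt(sum(squared) / len(squared))
    best = min(rmse, key=rmse.get)
    return best, rmse


# 38 observations with a pure linear trend; reserve the last 4.
series = [float(t) for t in range(1, 39)]
methods = {"naive": naive, "mean": mean_forecast, "ses": ses}
best, rmse = select_best(series, 4, methods)
print(best, rmse)
```

With 38 observations and a reserve set of 4, the loop fits on the first 34, 35, 36, and 37 points in turn, matching the successive steps described above.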

3. Examples. Table 8 gives a frequency chart for 63 typical series
selected from the project; each series in the project involved 38
quarterly observations, as described above. Each row specifies the model
form type identified by the automatic identification software for the
Box-Jenkins technique described above. Each column identifies one of
the techniques employed in the local process described above. Each
element of the table gives the number of series among the 63 which were
identified as a particular ARIMA form and were "best" modeled by a
particular technique. Figure 1 illustrates a model which was identified
as autoregressive nonseasonal, and this form was selected as "best" on
the reserve set (i.e., minimum mean square error over time points 35,
36, 37, and 38). Figure 2 illustrates a series identified as a
nonstationary seasonal moving average, and this form was selected as
"best" on the reserve set. Figure 3 illustrates a series identified as
nonseasonal autoregressive, but a Winters/Holt Model was selected as the
"best" on the reserve set. Figure 4 illustrates a series identified as
white noise, but a model of the Gardner Multiplicative Nonseasonal
Nonlinear Trend type was selected as the "best" on the reserve set.
Figure 5 illustrates a series identified as white noise, but a model of
the Winters/Holt Multiplicative Seasonal type was selected as the "best" on
the reserve set. From this sampling of typical time series it is
evident that the all-inclusive process picked model forms from a variety
of techniques. It is amazing that even series identified as white noise
by examination of their sample autocorrelation and partial
autocorrelation can sometimes be better approximated over the reserve
set by an exponentially weighted moving average model.

Figure 6 illustrates the preponderance of these 63 series which have
only a small set of values. In this sampling 26 out of 63 (about 41
percent) of the series take on values from zero to ten.


Table  8.  Model  Frequency 


MINIMUM  MSE  OVER  RESERVE  SET 


4. SUMMARY. It is difficult to obtain guidance from the literature on
how to evaluate short time series. Even though the sample statistics
from which time series forms are identified become very unstable for
short series, there is an operational need for forecasting the short
series. This paper has described attempts to cope with a real problem
in the face of little guidance. The purpose of this paper is to solicit
further guidance on the subject. Specifically we would like to ask
several questions:

a. What forecasting method(s) are recommended for situations
involving hundreds of "short" series with little time to accomplish?

b. Why does Box-Jenkins not make a better showing with respect to
the exponentially weighted moving average models?

c. How do you recommend comparing forecasts from several
techniques?

d. Are there any special techniques for treating series which take
on only a small set of values such as the integers zero to ten?
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ABSTRACT: This is a preliminary report on the feasibility of a
predictive model for an Army data source.

The following seven tasks will be undertaken to determine the
feasibility of a predictive model for a certain Army data source:

1.  Clarification  of  the  description  of  the  data. 

2.  Classification. 

3.  Determination  of  the  effects  that  influence  the  data. 

4. Stratification of the data by location, mission usage, etc.

5.  Examination  of  the  quality  of  the  data. 

6.  Adjustment  of  data.  Effects,  time  period,  outliers,  etc. 

7.  Selection  of  a  predicting  model. 

I. INTRODUCTION:

a.  An  adequate  data  source  is  important  for  obtaining 
reliable  results  from  statistical  analysis.  However,  if  the  data 
source  is  inadequate,  the  choice  of  analytical  techniques 
selected  to  perform  an  analysis  can  improve  the  validity  of  the 
results  and  thus  increase  the  accuracy  of  the  prediction.  The 
existing  Army  data  collection  methodology  is  not  fully  compatible 
with  known  predictive  techniques.  It  is  difficult  to  analyze  the 
existing  data  statistically  and  to  obtain  useful  and  valid 
information  such  as:  safety,  reliability,  readiness,  cost,  mean 
time  between  failures,  mean  time  between  replacement,  or 
maintenance  cost  of  certain  systems.  It  is  even  more  difficult 
to  use  the  presently  collected  data  for  predicting  any  of  the 
above  information  with  a  high  confidence  level. 

b. This is a preliminary report. The report addresses some
of the ways in which current Army data sources may be used in the
application of a predictive technique and some of the
techniques that could be utilized for conducting predictive
analysis, given adequate data.

c. In Section II, some of the applicable prediction
techniques are presented. The basic requirements for a data
source to be compatible with predicting techniques are discussed
in Section III. Then the problematic areas for both Army data
sources and predicting techniques are stated in Section IV. In
Section V, the approaches to be taken to determine the
feasibility of generating a "Predictive Analysis Model" for an
Army data source are discussed. Some of the possible
applications for a predictive analysis technique are provided in
Section VI.

II. Predicting Techniques and Fitting Criteria. The
predicting (or forecasting) techniques discussed here are the
known scientific ones. The structures of these scientific
predictions can be determined by statistical and mathematical
methods. Although each technique is somewhat unique in its
predicting capability, in practice it has been found that a very
large group of data can be fitted with a "reasonable
confidence level" by one of the following basic models or one of
their combinations: constant mean, linear trend, linear
regression, autoregression, moving average, seasonal and periodic
models, and exponential and non-linear models. A brief
description of each of these techniques is given below. (See
Gilchrist.)

A. The constant mean model. This technique is of the form

$x_t = \mu + \epsilon_t$,  t = 1, 2, 3, ...

where $\mu$ is the constant mean of all $x_t$'s and $\epsilon_t$ is one of a
sequence of independent random variables with zero expectation,
i.e. $E(\epsilon_t) = 0$, and constant variance $\sigma^2$. Fitting criteria: zero
mean error, reasonably small confidence interval. This method
deals with a set of data that can be fitted approximately by a global
constant mean.

B. Linear trend model. This technique is of the form

$x_t = \alpha + \beta t + \epsilon_t$,  t = 1, 2, 3, ...

where $\alpha$ is the expectation of $x_0$, $\beta$ is a constant slope and $\epsilon_t$
is one of a sequence of independent random variables with zero
expectation, i.e. $E(\epsilon_t) = 0$, and variance $\mathrm{Var}(\epsilon_t) = \sigma^2$.
Use the least squares method for the fitting criteria.
The method deals with a data structure showing a linear trend
with a random variation added.
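
A minimal sketch of least squares fitting for a linear trend of the form $x_t = a + b t + e_t$ is given below. The function names are illustrative assumptions, and observations are assumed to be indexed t = 1, 2, ..., n.

```python
def fit_linear_trend(x):
    """Return the least squares intercept a and slope b for x_t = a + b*t + e_t.

    Observations are assumed to be indexed t = 1, 2, ..., len(x).
    """
    n = len(x)
    t = range(1, n + 1)
    t_bar = sum(t) / n
    x_bar = sum(x) / n
    s_tt = sum((ti - t_bar) ** 2 for ti in t)
    s_tx = sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x))
    b = s_tx / s_tt
    a = x_bar - b * t_bar
    return a, b


def forecast_linear_trend(x, horizon):
    """Extrapolate the fitted line horizon steps past the last observation."""
    a, b = fit_linear_trend(x)
    return a + b * (len(x) + horizon)


# Noise-free illustrative data generated from a = 2.0, b = 0.5.
x = [2.0 + 0.5 * t for t in range(1, 11)]
print(fit_linear_trend(x), forecast_linear_trend(x, 2))
```

With noise-free data the fit recovers the generating intercept and slope up to rounding; with real data the residuals estimate the random variation.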


NOTE: Techniques A and B use only the past values of the
variable being forecast to predict its future values. These
approaches are therefore limited in their ability to obtain the best
forecasts, because they fail to use the influence information
contained in other variables.

C. Regression model. The general form of this technique is

$y_t = \sum_{i=0}^{k} \beta_i x_{ti} + \epsilon_t$,  t = 1, 2, 3, ...

where the $\beta_i$'s, i = 0, 1, ..., k, are constants, the $x_{ti}$'s are variables
related to $y_t$, and $\epsilon_t$ is a random variation. When the data
structure shows a seasonal trend, some of the $x_{ti}$'s may be replaced
by harmonic terms. The criterion for fitting is the mean square
forecasting error.

NOTE: Random variables in techniques A, B and C, discussed above,
are simply "errors" which were added to a strictly
deterministic function. Technique C will also make use of other
information that is related to the one being forecasted so that a
better forecasting product could be obtained.

D. Autoregressive model. This technique is of the form

x_t = \sum_{i=1}^{p} \phi_i x_{t-i} + \epsilon_t,  t = 1, 2, 3, ...

where the \phi_i's, i = 1, ..., p, are constants estimated from the given data, and
the \epsilon_t's are identically distributed with zero mean and constant
variance, i.e. E(\epsilon_t) = 0 and Var(\epsilon_t) = \sigma^2. It is usually denoted
by AR(p), where p is the order of the autoregressive model and
is a positive integer. For fitting criteria see page 81 of
Pankratz.

E. Moving average model. This technique is of the form

x_t = \epsilon_t + \sum_{j=1}^{q} \theta_j \epsilon_{t-j},  t = 1, 2, 3, ...

where the \theta_j's, j = 1, ..., q, are constants estimated from the given data,
and the \epsilon_t's are identically distributed with zero mean and constant
variance, i.e. E(\epsilon_t) = 0 and Var(\epsilon_t) = \sigma^2. It is usually denoted
by MA(q), where q is the order of the moving average model and
usually is a finite positive integer. For fitting criteria see
page 61 of Pankratz.

NOTE: Techniques D and E are stochastic models in which the random
variables play the dominant part in determining the structure of
the models. Box and Jenkins (1970) combined the autoregressive and
moving average models into one model that improves the
forecasting. This mixed model, usually called the Box-Jenkins
model, is denoted by ARMA(p,q). Moreover, some non-stationary
models may become stationary when the x_t's are replaced by differences
of the x_t's. The d-th difference is obtained by differencing the
x_t's d times in succession. The integrated autoregressive -
moving average model, denoted by ARIMA(p,d,q), is the result of
combining the d-th differencing process with ARMA(p,q). (See Box and
Jenkins for mathematical forms.)
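
The d-th differencing step described in the note can be sketched as follows. This is a minimal illustration with a hypothetical function name and made-up data; it shows only the differencing, not the fitting of an ARIMA model.

```python
def difference(x, d=1):
    """Return the d-th difference of the sequence x.

    Each pass replaces the series by its successive differences
    x_t - x_{t-1}, so the result has len(x) - d elements.
    """
    if d < 0:
        raise ValueError("d must be non-negative")
    out = list(x)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


quadratic = [1, 4, 9, 16, 25]        # hypothetical series with a quadratic trend
first = difference(quadratic, d=1)   # still trending
second = difference(quadratic, d=2)  # constant, so the trend has been removed
```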


F. Seasonal and periodic models. These techniques do not
have a unique form. They deal with data that display a
repeating pattern with a certain period. Some methods often used
to deal with seasonal data are: seasonal index methods, Fourier
methods (see Gilchrist), and stochastic (or integrated
autoregressive - moving average) methods (see Box-Jenkins,
Gilchrist, Jenkins, and Pankratz).

G. Exponential and non-linear models. In many situations,
e.g. growth or decay, a reliable forecast can be obtained only by
fitting the data with an exponential curve, a logarithmic parabola,
or a modified version of these curves.

Remarks:

1. When the data structure is highly stable and the chosen
model truly describes the underlying structure of the data,
the model is called a "global model". When the data structure
is not stable overall but is stable in the short run, so that
the instability does not affect the technique's ability to
forecast over a short period, the model used for forecasting
over that short period is called a "local model". There are no
differences in the mathematical or statistical formulation of
these two types of models. The only difference is in the way
in which the models are used.

2. Most predictions involve forecasting more than one
variable. If the variables are independent, then each variable is
predicted by univariate forecasting. If the variables are
correlated, then multivariate forecasting should be used.
The techniques for multivariate forecasting are studied by
various time series analysts. (See Hannan, Jones and Robinson.)

III. Basic Requirements for a data source to be compatible
with predictive techniques.

It  is  easy  to  see  that  the  quality  of  forecasting  can  not  be 
any  better  than  the  quality  of  data  available  for  analysis.  But 
it  is  nearly  impossible  to  define  the  quality  of  data. 

Generally,  the  data  should  provide  the  information  to  meet  the 
following  requirements  (see  Gilchrist) : 

A.  The  data  should  provide  directly  relevant  information. 

B. The data should provide reliable information.

C.  The  data  should  continually  and  promptly  provide  new 
information. 

The directly relevant and reliable information will help
obtain  forecasting  results  with  higher  accuracy,  and  the  new 
information  is  essential  for  validation  of  the  model.  In 
practical  forecasting,  there  are  many  occasions  in  which  the  data 
sets  do  not  satisfy  these  three  conditions.  In  this  event,  the 
following  processes  may  help  to  improve  the  forecast. 


A. The data should be obtained by the forecasters
themselves.

a.  The  data  should  be  examined  to  see  how  well  the  three 
requirements  are  met. 

b.  The  data  should  also  include  information  about  the 
external  environment. 

B. Where the data were not obtained by the forecasters, the
forecasters must use robust and exploratory methods to examine
the data sets informally. These methods reveal the structure of
the data and provide an extensive repertoire of techniques for
studying the data in detail.

IV.  Problematic  areas  of  existing  Army  data  collections  and 
statistical  Predictive  Techniques. 

A.  Data  sources. 

In  general,  Army  data  collected  are  affected  by  the 
following  conditions  that  are  outside  the  controls  of  the  data 
collector  or  analysts: 

i.  Fleet  changes  in  aircraft  configurations. 

ii.  Periods  when  the  fleet  was  grounded. 

iii.  Changes  in  usage  rates  which  provide  for  more  or  less 
exposure  to  replacement. 

iv.  The  data  are  not  fully  compatible  with  classical 
statistical  analysis  or  predictive  analysis  techniques. 

v. There are multiple National Stock Numbers (NSN),
part numbers (NP), manufacture lots, etc. for an individual
generic piece on some systems (e.g. parts in aircraft).

vi.  The  location  changes  in  fleet  employment  have  been 
shown  to  influence  part  replacement  rates. 

vii. There are dynamic changes in maintenance procedures:
inspection  intervals,  inspection  activity,  repair  levels,  part 
rework  procedures,  etc. 

viii. Some data collection programs do not contain the
relevant information about the external environment.

ix. In most cases the forecasters are not involved in
either the data collection plan or the collection process.

B. Statistical Predictive Techniques.

a. Risk/Confidence Levels: Statistical predictions
always involve some uncertainty, and the desired confidence level
is not always attainable.

b.  Compatibility  with  data:  Only  data  with  a  stable 
structure  can  be  forecasted  with  low  risk  and  high  confidence 
levels. 

c. Predictive Capabilities: The confidence level
decreases as the number of lead predictions increases.

d. Data requirements: For the statistical analysis to be
valid, the three minimum requirements mentioned
in Section III should be met.

V. Approaches. Now that the basic requirements for a data set
to be compatible with the predictive methodologies, and the
problematic areas, have been identified, the forecaster must select an
approach to filter out the unwanted information and to obtain a
forecast model with high reliability. The following tasks must
be accomplished to obtain a more valid analytical output from a
predictive model with an acceptable confidence level.

Task  1.  Clarification  of  the  description  of  the  data. 

The  purpose  of  this  task  is  to  select  from  a  given  data  set  the 
most  relevant  information  needed  for  prediction.  If  the 
forecaster  was  not  directly  involved  in  collecting  the  data,  then 
intensive  interviews  with  data  collectors  and/or  field  visits 
should  be  done  to  understand  how  the  data  were  initially  obtained 
and  the  standard  method  used.  This  will  assist  the  forecaster  in 
the  selection  of  the  forecasting  model  and  in  the  interpretation 
of  the  forecasting  results. 

Task  2.  Classification.  There  are  many  kinds  of  weapon  and 
equipment  systems.  The  scenario  and  environment  in  which  each  of 
these  systems  may  be  used  could  have  significant  impact  on  the 
collection  and  structuring  of  the  data,  the  selection  of  the 
forecasting techniques, and the interpretation of the forecast
results. Therefore it is essential for the forecaster to
classify the commodity, i.e., aircraft, missiles, etc., of the
systems  for  which  forecasting  efforts  are  to  be  applied  and  to 
identify  the  prediction  rationale  prior  to  the  initiation  of  a 
forecasting  exercise. 

Task  3.  Determine  the  effects  that  influence  the  raw  data. 
There  are  certain  effects  that  are  known  to  influence  the  raw 
data.  Some  of  these  have  already  been  mentioned  in  Section  IV. 

A  suitable  adjustment  should  be  made  for  the  effects  to  improve 
the  forecasting  accuracy. 

Task  4.  Stratification.  Many  data  collections  contain  vast 
amounts  of  information  obtained  from  different  locations, 
missions,  usages,  climates,  etc.  Each  group  contains  relevant 
information  for  its  particular  predicting  purpose. 


Task  5.  Examine  the  quality  of  the  data.  Use  the  three 
minimum  requirements  in  Section  III  to  determine  the  quality  of 

the  stratified  data  and  whether  any  adjustments  need  to  be  made. 

Task 6. The adjustment of data. Most Army data are not
collected for the purpose of forecasting. The raw data may need
to be modified to allow or disallow for some features before the
data can be used in the prediction process.

a. Adjusting for known influencing causes. Some of the
known causes have already been mentioned in Section IV. The ways
of dealing with these vary greatly.

b. Adjusting for time period. Almost all forecasting
methods assume that data are input at fixed intervals. If this
is not so, then some adjustment has to be made to produce a new
data set at the required intervals. For example, a monthly
data set may be adjusted into a quarterly or yearly data set, or a
yearly data set into a monthly, bi-monthly, quarterly, etc. data set.
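
The simpler direction of this adjustment, from a finer interval to a coarser one, can be sketched as below. The sketch assumes the data are counts (for example, parts replaced per month), so that blocks are summed; rates would be averaged instead. The function name and the monthly figures are hypothetical.

```python
def aggregate(values, group_size):
    """Sum consecutive blocks of `group_size` observations.

    With group_size=3 a monthly series of counts becomes quarterly;
    with group_size=12 it becomes yearly.
    """
    if group_size <= 0 or len(values) % group_size != 0:
        raise ValueError("series length must be a multiple of group_size")
    return [
        sum(values[i:i + group_size])
        for i in range(0, len(values), group_size)
    ]


monthly = [4, 6, 5, 7, 8, 6, 9, 10, 8, 7, 6, 5]  # hypothetical monthly counts
quarterly = aggregate(monthly, 3)
yearly = aggregate(monthly, 12)
```

Going the other way (yearly to monthly) cannot be done by arithmetic alone; it requires an assumption about how the total is spread within the year.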

c.  Adjusting  data  by  transformations.  Some  data  sets 
with  nonstationary  variance  may  be  transformed  into  a  stationary 
one  by  natural  logarithms.  Some  data  sets  with  a  nonstationary 
mean  may  be  transformed  into  a  data  set  with  a  stationary  mean  by 
applying  a  differencing  procedure. 

d. Adjustment for outliers. Most practical forecasting
systems contain quality control procedures that will pick out
values which are in some sense extreme (by engineering judgement,
or by lying more than a given number of standard deviations from
the mean error, etc.); these values are called outliers. In an
operational forecasting system, it is advisable to replace the
outliers with some other suitable, but less extreme, values so
that the lead forecasts will not be influenced by outliers.
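
One simple way to carry out such a replacement is sketched below: values lying more than k standard deviations from the mean are pulled back to the k-standard-deviation limit. The rule, the choice k = 2, and the data are assumptions made for illustration; the paper does not prescribe a particular replacement rule.

```python
from statistics import mean, stdev


def clip_outliers(values, k=2.0):
    """Replace values beyond mean +/- k*stdev by the nearer limit."""
    m = mean(values)
    s = stdev(values)
    low, high = m - k * s, m + k * s
    return [min(max(v, low), high) for v in values]


errors = [10, 11, 9, 10, 12, 10, 50]  # hypothetical data with one extreme value
cleaned = clip_outliers(errors, k=2.0)
```

Because the extreme value itself inflates the standard deviation, a large k may fail to flag it; a robust scale estimate would be a reasonable alternative.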

Task 7. Selection of a predicting model. Having completed
the above six tasks for a data set, the forecaster should examine
the three minimum requirements of Section III again before
constructing a forecasting model. In general, the
forecaster begins by fitting the simplest model to the reduced
data. If the model does not fit well according to its criteria,
then the next model should be tried. Often a
forecaster cannot obtain a model with the confidence level
desired. Some experienced judgement must be exercised before the
final model selection, or sometimes a forecaster may use a model
and then modify it as new data come in.

VI. Applications: The ability to predict (forecast) a given
operational parameter of a system is one of the most important
elements of logistic support and managerial decision making.
Predictive analysis techniques can help project the operational
readiness, dependability, and safety, and hence the probability of
mission success, of a system. These techniques can also assist in
the development of a mathematical model for provision planning of
a system, manpower maintenance planning, logistic support
planning, etc. Moreover, the results of the predictive analysis
can be used to assess engineering design specifications and to
influence engineering designs or changes.


Summary: 

There  are  numerous  sources  of  data  collection  in  the  Army 
community.  It  is  worthwhile  to  investigate  which  of  these  data 
collections  may  be  used  to  predict  safety,  reliability,  cost, 
etc.  There  are  also  several  known  forecasting  techniques  that 
are  available,  and  the  forecaster  must  use  discretion  in  the 
preparation  of  the  data  to  be  used  with  each  technique,  as  well 
as the selection of the technique.

QUANTILE STATISTICAL DATA ANALYSIS

Abstract.

This paper presents some reasons why theoretical and sample quantile functions should be routinely used
by contemporary statistical data analysts. Quantile methods are introduced in the context of the exponential
distribution as a fit to the historically important life table data of Graunt (1661). Section titles are: history
of statistics and contemporary textbooks; quantile concepts; identification quantile function; identification
quantile box plot; tail classification of probability laws; goodness of fit plots; IQQ plot; cumulative weighted
spacings function D(u); quantile simulation and distribution of extreme values; comparison quantile function;
nonparametric estimation of probability density; conclusion.

1.  History  of  Statistics  and  Contemporary  Textbooks. 

A central problem of statistical data analysis [that was formulated by 19th century pioneers such as
Quetelet (1796-1874) and Galton (1822-1911)] is identifying distributions that fit the data. In The History
of Statistics, Stigler (1986) writes (p. 268) that these pioneers emphasized the use of normal curves to fit
data; they 'proposed that the conformity of the data to this characteristic [normal] curve was to be a sort
of test of the appropriateness of classifying the data together in one group; or rather the nonappearance of
this curve was indicative that the data should not be treated together.'

By 1875 Galton 'had devised a different way of displaying the data. He ordered the data in increasing
order, and, effectively, graphed the data values versus the ranks.' Galton used the name 'ogive' for the
theoretical form of this curve for a normal distribution; Stigler writes 'we now call it the inverse normal
cumulative distribution function'. I call this ideal graph a quantile function of the normal distribution; the
graph of ordered data values, denoted X(j; n), versus (j - .5)/n or j/(n + 1), is called the sample quantile
function, denoted \tilde{Q}(u), 0 < u < 1.

This  paper  presents  some  reasons  why  theoretical  and  sample  quantile  functions  should  be  routinely 
used  by  contemporary  statistical  data  analysts.  They  can  be  used  to  not  only  test  the  fit  (or  lack  of  fit)  of  a 
normal  distribution  to  data,  but  also  to  describe  other  general  families  of  distributions  and  to  identify  which 
distributions  fit  the  data. 

Textbooks with titles such as Introduction to Contemporary Statistical Methods omit many important
topics that are actually useful in the theory and practice of statistical data analysis. On my list of important
topics (for which I always look in the index and usually fail to find) are: uniform distribution, exponential
distribution, order statistics, extreme values, quantile function. Traditional introductory textbooks describe
methods based on mean and variance. To qualify as 'contemporary' a textbook adds the following topics:
box plot, fences, stem and leaf plot, trimmed and Winsorized sample. In my opinion quantile function
interpretations are needed for these topics to acquire beauty and utility that will excite students; however
how to do this is not explicitly discussed in this paper.


We introduce the ideas of quantile-based statistical data modeling in the context of the exponential
distribution. Let X be a continuous random variable with distribution function F(x) = Pr[X \le x] and
probability density function f(x) = F'(x).

We call F(x) an exponential distribution with parameter \lambda if

1 - F(x) = \exp(-\lambda x), x > 0;   f(x) = \lambda \exp(-\lambda x), x > 0.

Its mean \mu equals 1/\lambda, since (for a non-negative random variable)

\mu = \int_0^\infty x f(x) dx = \int_0^\infty (1 - F(x)) dx = \int_0^\infty \exp(-\lambda x) dx.

The  standard  exponential  distribution  is  the  exponential  distribution  with  mean  1. 

2. Quantile Concepts.

The QUANTILE FUNCTION Q(u), 0 < u < 1, is the inverse x = F^{-1}(u) of the distribution function
u = F(x). To find x = Q(u) one solves u = F(x).

For an exponential distribution, one obtains x = Q(u) by solving 1 - u = \exp(-\lambda x); therefore

Q(u) = (1/\lambda) \log (1 - u)^{-1} = \mu(-\log(1 - u)).

The mean \mu of a distribution F or random variable X can be computed from the quantile function Q:

\mu = \int_0^1 Q(u) du.

The MEDIAN and QUARTILES of a distribution F or random variable X are defined to be

Q(.5), Q(.25), Q(.75),

the values of Q(u) at u = .5, .25, .75. We define QUARTILE DEVIATION DQ by DQ = 2(Q(.75) - Q(.25)).

For an exponential distribution, Q(.5) = \mu \log 2 \approx .69\mu; Q(.25) = \mu \log(4/3) \approx .29\mu; Q(.75) = \mu \log 4 \approx
1.39\mu. The interquartile range Q(.75) - Q(.25) = 1.1\mu; quartile deviation DQ = 2(Q(.75) - Q(.25)) = 2.2\mu.
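
The exponential quantile function and the quartile constants above can be checked numerically with a short sketch. The function name `exp_quantile` is hypothetical; natural logarithms are assumed throughout, as in the text.

```python
import math


def exp_quantile(u, mu=1.0):
    """Quantile function Q(u) = -mu * log(1 - u) of an exponential with mean mu."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must satisfy 0 <= u < 1")
    return -mu * math.log(1.0 - u)


median = exp_quantile(0.5)          # log 2
lower = exp_quantile(0.25)          # log(4/3)
upper = exp_quantile(0.75)          # log 4
iqr = upper - lower                 # log 3, about 1.1
dq = 2.0 * iqr                      # quartile deviation, about 2.2
```

For a mean other than 1, every quantity scales by mu, which is why the text states the results as multiples of \mu.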

Two important quantile concepts are q(u) = Q'(u), QUANTILE DENSITY FUNCTION, and fQ(u) =
f(Q(u)), DENSITY QUANTILE FUNCTION. For F continuous, F(Q(u)) = u and fQ(u)q(u) = 1. For a
standard exponential distribution, fQ(u) = 1 - u.

Two important universal measures of scale of a distribution are DQ and 1/f(median) = 1/fQ(.5) =
q(.5). They approximately equal each other because DQ is a numerical derivative of Q(u) at u = .5.

How do we apply these concepts to determine distributions that fit data? Given data (sample) compute
a sample quantile function denoted \tilde{Q}(u). The sample distribution function is defined by \tilde{F}(x) = fraction
of sample \le x; the sample quantile function \tilde{Q}(u) is the inverse of \tilde{F}(x). In terms of the order statistics
X(1; n) \le ... \le X(n; n) of a sample

\tilde{Q}(u) = X(j; n) for (j - 1)/n < u \le j/n.

One usually adopts a continuous version of the sample quantile function defined by linear interpolation
between its values

\tilde{Q}((j - .5)/n) = X(j; n).

When true mean \mu = 18, and the distribution is exponential, Q(.5) = 12.4, Q(.25) = 5.2, Q(.75) = 25.
If similar values hold for the sample analogues of population parameters (denoted by adding a tilde ~ to
the population notation) one suspects, and conjectures, that an exponential distribution fits.
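
The continuous version of the sample quantile function described above can be sketched as follows for ungrouped data. The function name is hypothetical, and the behavior for u below .5/n or above (n - .5)/n (returning the smallest or largest observation) is an assumption, since the text does not specify it.

```python
def sample_quantile(data, u):
    """Continuous sample quantile: linear interpolation through
    the points ((j - .5)/n, X(j; n)), j = 1, ..., n."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must be non-empty")
    pos = u * n + 0.5          # fractional 1-based rank j solving u = (j - .5)/n
    if pos <= 1:
        return xs[0]           # assumed: flat extension below the first point
    if pos >= n:
        return xs[-1]          # assumed: flat extension above the last point
    j = int(pos)               # lower neighbouring rank
    frac = pos - j
    return xs[j - 1] + frac * (xs[j] - xs[j - 1])


sample = [4, 1, 3, 2]                     # hypothetical data
median = sample_quantile(sample, 0.5)     # halfway between X(2; 4) and X(3; 4)
```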


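Table 1. Graunt's Life Table (1661). Observed proportion and cumulative
proportion in various intervals of observed values of age at time of death (in London).

Age interval                           Proportion   Cumulative proportion
\tilde{Q}(u(j-1)) - \tilde{Q}(u(j))    p(j)         u(j)

 0-6                                   .36           .36 = \tilde{F}(6)
 6-16                                  .24           .60 = \tilde{F}(16)
16-26                                  .15           .75 = \tilde{F}(26)
26-36                                  .09           .84 = \tilde{F}(36)
36-46                                  .06           .90 = \tilde{F}(46)
46-56                                  .04           .94 = \tilde{F}(56)
56-66                                  .03           .97 = \tilde{F}(66)
66-76                                  .02           .99 = \tilde{F}(76)
76-86                                  .01          1.00 = \tilde{F}(86)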

Table 2. Graunt's Life Table Sample Quantile Function.

j                   0     1     2     3     4     5     6     7     8     9
u(j)               .00   .36   .60   .75   .84   .90   .94   .97   .99  1.00
\tilde{Q}(u(j))     0     6    16    26    36    46    56    66    76    86

For an illustrative example we consider Graunt's Life Table data (that should be familiar to all students
of statistics). It was published in 1661 by John Graunt, in an attempt to analyze data dealing with age
at time of death in London. The original data was collected by Thomas Cromwell in 1534 from Church of
England records of births and deaths. Graunt is credited with starting modern statistics by creating Table 1.
Brilliant lectures by James R. Thompson of Rice University brought this important data set to my attention.

From Graunt's life table (Table 1) one computes sample mean \tilde{\mu} = 18.22 (in words, the average age at
death was approximately 18 years), \tilde{Q}(.25) = 4.2, \tilde{Q}(.5) = 11.8 (median age at death was approximately 12
years), \tilde{Q}(.75) = 26, DQ = 43.6. These are found by interpolating the values of the sample quantile function
in Table 2.
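
The interpolation can be reproduced with a short sketch. The breakpoints are the cumulative proportions and ages of Graunt's table (0, 6, 16, ..., 86 with cumulative proportions .36, .60, ..., 1.00); the function name `q_tilde` is hypothetical.

```python
# Cumulative proportions u(j) and ages Q~(u(j)) from Graunt's life table.
U = [0.00, 0.36, 0.60, 0.75, 0.84, 0.90, 0.94, 0.97, 0.99, 1.00]
Q = [0, 6, 16, 26, 36, 46, 56, 66, 76, 86]


def q_tilde(u):
    """Sample quantile function: linear interpolation between table values."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    for j in range(1, len(U)):
        if u <= U[j]:
            w = (u - U[j - 1]) / (U[j] - U[j - 1])
            return Q[j - 1] + w * (Q[j] - Q[j - 1])
    raise ValueError("unreachable for u in [0, 1]")


q25, q50, q75 = q_tilde(0.25), q_tilde(0.50), q_tilde(0.75)
dq = 2.0 * (q75 - q25)  # quartile deviation
```

Unrounded, the sketch gives a quartile deviation near 43.7; the value 43.6 quoted in the text corresponds to using the rounded lower quartile 4.2.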

To compute sample mean (from grouped data) we use formulas

\tilde{\mu} = \sum_{j=1}^{k} .5(\tilde{Q}(u(j-1)) + \tilde{Q}(u(j)))(u(j) - u(j-1))

            = \sum_{j=1}^{k} (\tilde{Q}(u(j)) - \tilde{Q}(u(j-1)))(1 - .5(u(j-1) + u(j))).

The second formula can be interpreted using the fact that 1 - u is the standard exponential density quantile.
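
Both formulas can be evaluated on Graunt's table to confirm that they agree. This is a minimal sketch; the variable names are hypothetical, and the data are the table breakpoints.

```python
# Cumulative proportions u(j) and ages Q~(u(j)) from Graunt's life table.
U = [0.00, 0.36, 0.60, 0.75, 0.84, 0.90, 0.94, 0.97, 0.99, 1.00]
Q = [0, 6, 16, 26, 36, 46, 56, 66, 76, 86]

# First formula: interval midpoints weighted by interval proportions.
mean_midpoint = sum(
    0.5 * (Q[j - 1] + Q[j]) * (U[j] - U[j - 1])
    for j in range(1, len(U))
)

# Second formula: spacings weighted by 1 - (midpoint of u), i.e. by the
# standard exponential density quantile evaluated at the interval midpoint.
mean_spacing = sum(
    (Q[j] - Q[j - 1]) * (1.0 - 0.5 * (U[j - 1] + U[j]))
    for j in range(1, len(U))
)
```

The two sums are rearrangements of one another (summation by parts), so they agree up to floating-point rounding.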

It does not seem to be customary in the literature to discuss which distributions fit the data that one is
analyzing (here Graunt's life table). Techniques are discussed in this paper which can guide the statistical
data analyst to identify and test standard parametric distributions (such as the exponential distribution) as
a smooth distribution that fits the sample. We discuss the respective roles of: (i) \tilde{F}(x), sample distribution
function; (ii) \tilde{Q}(u), sample quantile function; (iii) \hat{F}(x), smooth distribution estimated from data (for
Graunt's life table, an exponential distribution with mean 18.22); (iv) \hat{Q}(u), smooth quantile function; (v)
\tilde{D}(u) = \hat{F}(\tilde{Q}(u)), comparison quantile function; (vi) \tilde{D}(u), cumulative weighted spacings, which tests constancy
of the ratio of derivatives \tilde{Q}'(u)/\hat{Q}'(u); (vii) \tilde{QI}(u), identification quantile function. The statistician's problem
is to develop a framework which explains how and why to use these functions to develop graphical and
numerical diagnostics which guide us to identify distributions (such as the normal or exponential) that fit
the data.

3. Identification Quantile Function.

The median, which we henceforth denote MQ = Q(.5), is a universal measure of location. It is superior
to the mean by the criterion of being more robust (resistant to outliers in the data, whose presence will in
fact be detected by the identification quantile function). But we recommend the median not because of its
robustness but because it forms one of the tools of quantile-based methods of statistical data analysis.

Statisticians who favor (or at least teach) mean and standard deviation as measures of location and
scale use them to standardize the data by subtracting the mean and dividing by the standard deviation.
The quantile-based analogue of standardization is to transform the random variable X to

XI = (X - MQ)/DQ

whose quantile function is

QI(u) = (Q(u) - MQ)/DQ.

We call QI(u) the Identification Quantile Function. Our motivation for introducing this function is that
it is approximately equal to the unitized quantile function

Q1(u) = (Q(u) - MQ)/Q'(.5) = fQ(.5)(Q(u) - MQ),

which has value 0 and slope 1 at u = .5. The probability density f(x) corresponding to the unitized
quantile function has been normalized so that f(median) = 1. The unitized normal probability density is
f(x) = \exp(-\pi x^2).

Universal measures of location and scale are MQ and DQ. Diagnostic measures of skewness are

QI(.25), QI(.75), QIM = .5(QI(.25) + QI(.75)), -.25/QI(.25), .25/QI(.75);

note that always QI(.75) - QI(.25) = .5. Diagnostic measures of (left and right) tail behavior are QI(.01) and
QI(.99). A combined measure of tail behavior (useful for probability density estimation) is QI(.99) - QI(.01),
called the identification quantile range.

4. Identification Quantile Box Plot.

An identification quantile box plot is a plot consisting of a box from QI(.25) to QI(.75) with a midline
at QI(.5) = 0 and a cross at QIM. Fences are defined to be max(-1, QI(0)) and min(1, QI(1)). Lines
are drawn from identification quartiles to fences. Data values outside the fences are considered outliers or
out-and-outliers, depending on whether they are interpreted as representing long tails or blunders. One
also indicates the location of (sample mean - MQ)/DQ. The values of identification quartiles and fences are
recorded on the plot.

5. Tail Classification of Probability Laws.

Representations of the density quantile function behavior as u tends to 0 or 1 are used to provide a
quantitative index of tail behavior which we call the tail exponent. It is used to qualitatively classify tail
behavior in three types, called short, medium, and long. Medium tails are further classified in three groups:
medium-short, medium-medium, medium-long; a good summary of these concepts introduced by Parzen
(1979) is given by Schuster (1984).

These five groups reduce to three groups (short, medium, long) when expressed in terms of hazard rate
functions (decreasing, constant, increasing). The right and left hazard functions are respectively defined by

h_1(x) = f(x)/(1 - F(x)),   h_0(x) = f(x)/F(x).

The right and left hazard quantile functions are defined by

h_1Q(u) = fQ(u)/(1 - u),   h_0Q(u) = fQ(u)/u.

Our classifications of tail behavior can be empirically related to the behavior of the identification quantile
function as u tends to 0 or 1. The left tail is classified: 0 > QI(.01) > -.5, short tail; -.5 > QI(.01) > -1,
medium-short; -1 > QI(.01), medium-long and long tail. The right tail is classified short, medium-short, or
long according as QI(.99) < .5, .5 < QI(.99) < 1, 1 < QI(.99).

For Graunt's Life Table, QI(.25) = -.17, QI(.75) = .33, QIM = .5(QI(.75) + QI(.25)) = .08, QI(.01) =
-.27, QI(.99) = 1.47. Experience with typical values of these diagnostic measures for various standard
frequently encountered distributions leads one to conjecture that the sample distribution function \tilde{F}(x) of
the data in Table 1 is fit by an exponential distribution \hat{F}(x) with a suitable estimated mean \tilde{\mu}.
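
The identification quantile diagnostics for Graunt's table, and the empirical tail classification rule stated above, can be sketched as follows. The function names are hypothetical. The unrounded results differ from the quoted values in the second decimal for the quartiles, which is consistent with the text having used the rounded quartiles 4.2, 11.8, and 26.

```python
# Cumulative proportions u(j) and ages Q~(u(j)) from Graunt's life table.
U = [0.00, 0.36, 0.60, 0.75, 0.84, 0.90, 0.94, 0.97, 0.99, 1.00]
Q = [0, 6, 16, 26, 36, 46, 56, 66, 76, 86]


def q_tilde(u):
    """Sample quantile function: linear interpolation between table values."""
    for j in range(1, len(U)):
        if u <= U[j]:
            w = (u - U[j - 1]) / (U[j] - U[j - 1])
            return Q[j - 1] + w * (Q[j] - Q[j - 1])
    raise ValueError("u must lie in [0, 1]")


MQ = q_tilde(0.5)                            # median
DQ = 2.0 * (q_tilde(0.75) - q_tilde(0.25))   # quartile deviation


def qi(u):
    """Identification quantile function QI(u) = (Q(u) - MQ)/DQ."""
    return (q_tilde(u) - MQ) / DQ


def left_tail(qi_01):
    """Classify the left tail from QI(.01), following the rule in the text."""
    if qi_01 > -0.5:
        return "short"
    if qi_01 > -1.0:
        return "medium-short"
    return "medium-long or long"


def right_tail(qi_99):
    """Classify the right tail from QI(.99), following the rule in the text."""
    if qi_99 < 0.5:
        return "short"
    if qi_99 < 1.0:
        return "medium-short"
    return "long"


qim = 0.5 * (qi(0.25) + qi(0.75))
tails = (left_tail(qi(0.01)), right_tail(qi(0.99)))
```

A short left tail combined with a long right tail by this rule is the pattern the text associates with an exponential fit.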

6. Goodness of Fit Plots.

To evaluate the fit of a model described by \hat{F}(x) or \hat{Q}(u) to data described by \tilde{F}(x) or \tilde{Q}(u) one
has a bewildering number of options. The theory of goodness of fit tests is concerned with the theoretical
study of the many test statistics available, and offers little practical guidance on which methods to use in
practice. This extensive literature can only be briefly illustrated in this paper, with emphasis on graphical
comparisons.

One can compare plots: (1) \tilde{F}(x) and \hat{F}(x) vs. x, on the same graph; (2) \tilde{Q}(u) vs. \hat{Q}(u), called Q-Q
plot; (3) \tilde{D}(u) = \hat{F}(\tilde{Q}(u)) vs. u, called D-uniform plot (it is equivalent to a plot of \hat{F}(x) vs. \tilde{F}(x) called a
P-P plot). We recommend variants of the last method. One can interpret \tilde{D}(u) as sample quantile function
of the transformed random variable \hat{U} = \hat{F}(X). The goodness of fit problem is transformed to tests of fit
of \hat{U} by a uniform [0,1] distribution and by estimation of the true quantile function, denoted D(u), of \hat{U}.
We call \tilde{D}(u), 0 < u < 1, a sample comparison quantile function.

When \hat{F} is exponential, \tilde{D}(u) = 1 - \exp(-\tilde{Q}(u)/\tilde{\mu}). Its values for Graunt's life data are given in Table
3. Figure 2 presents a IQQ plot as a test of fit of Graunt's life table by an exponential distribution. Figures
3-6 present plots on the same graph of sample and smooth distributions. The combinations are \tilde{F}(x) and \hat{F}(x)
vs. x (Figure 3), \tilde{Q}(u) and \hat{Q}(u) vs. u (Figure 4), \tilde{Q}(u) vs. \hat{Q}(u), a Q-Q plot (Figure 5), and \tilde{F}(x) vs.
\hat{F}(x), a P-P plot (Figure 6) which also plots \tilde{D}(u) = \hat{F}(\tilde{Q}(u)). Figures 6 and 7 present D(u) plots as tests
of fit of Graunt's life table by an exponential distribution; \tilde{D}(u) = cumulative weighted spacings in Figure 7.

7. IQQ (Identification Quantile - Quantile) Plot.

To test whether a sample is normal or exponential, one tests the hypothesis Q(u) = \mu + \sigma Q_0(u) by a
scatter plot of (Q_0(u(j)), \tilde{Q}(u(j))) at suitable values u(j), j = 1, ..., k, in the interval 0 < u < 1. This plot,
called a Q-Q plot, is judged visually for linearity.

We prefer to use what we call a IQQ plot; it is a scatter diagram of (Q_0I(u(j)), \tilde{QI}(u(j))) with a grid
of lines which may make it easier to judge visually for linearity. A IQQ plot for Graunt's life table is given
in Figure 2.

8. Cumulative Weighted Spacings Function D(u).

Users  of  QQ  and  IQQ  plots  report  that  they  are  difficult  to  interpret.  I  propose  that  one  should  prefer 
plots  that  are  graphs  of  functions  such  as  various  functions  D( u),0  <  u  <  1,  which  can  be  defined  to  measure 
the  'distance'  between  two  distributions. 

To  compare  Q(u)  with  n -*•  cQ0( u)  we  recommend  comparing  their  derivatives  (equal  to  ?(u)  and  o<7o(u) 
respectively).  S'nce  a  is  unknown  we  test  for  constancy  the  ratio  q(u)/qQ{u)  *  q(u)f0Q0[u)\  equivalently 
test  the  deviation  from  1  of 


d(u)  -  q(u)/0Qo(u)/«To, 


*0  =  /  g(t)/o9o(0<ft- 
Jo 

We  call  d(u)  a  weighted  (pacing*  function,  line*  (pacing*  X(fc;  n)  -  X(k  -  1;  n)  are  the  buiidirg  blocks  of 
•stlmatora  of  q( u). 

Ons  approach  to  testing  cf(u)  is  to  estimate  and  test  the  deviation  (from  the  uniform  function  Z>o(u)  =  u) 
of  the  cumulative  weighted  spacing*  function 

2>{u)  -  I* 

Jo 

The  sample  analogue  of  d(u)  and  £(u)  to  test  exponentiality  is:  for  u(;  -  1)  <  u  <  u (j),  <T(u)  » 

<T(j)  -  (Q-(u(y))  -  <?'{ u(j  -  1)))(1  -  .5(u(j  - 1)  +  u(i)))/M'i 
D~( u)  linearly  interpolates  its  values  D“(u(j))  ®>  <f  (1)  + ...  +  «T(/).  Note  that  <r<T  -  /*'. 
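The sample formulas above can be sketched numerically. The following is a minimal illustration, assuming the ages and cumulative proportions of Graunt's life table as they are read from Table 8, and assuming that the normalizer is the sum of the unnormalized weighted spacings (which is how we read the remark that $\tilde{\sigma}_0 = \tilde{\mu}$). The function name is ours, not the author's.

```python
# Graunt's life table as read from Table 8 (assumed readings of the printed values).
ages = [0, 6, 16, 26, 36, 46, 56, 66, 76, 86]                      # sample quantiles Q~(u(j))
u = [0.0, 0.36, 0.60, 0.75, 0.84, 0.90, 0.94, 0.97, 0.99, 1.00]    # levels u(j)


def cumulative_weighted_spacings(q, u):
    """Return (mu, D), where D[j] = d(1) + ... + d(j) under the exponential null.

    For the exponential distribution f0Q0(u) = 1 - u, evaluated here at the
    midpoint of each interval (u(j-1), u(j)).
    """
    raw = [
        (q[j] - q[j - 1]) * (1.0 - 0.5 * (u[j - 1] + u[j]))
        for j in range(1, len(q))
    ]
    mu = sum(raw)  # normalizer (assumption: chosen so that D(1) = 1)
    D = [0.0]
    for r in raw:
        D.append(D[-1] + r / mu)
    return mu, D


mu, D = cumulative_weighted_spacings(ages, u)
for j, value in enumerate(D):
    print(j, round(value, 3))
```

Under these assumptions the normalizer comes out near the fitted mean quoted in the caption of Table 8, and the cumulative values can be compared against the diagonal $D_0(u) = u$ at the same levels u(j).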

Table 8. GRAUNT'S LIFE TABLE: $\hat{Q}$, $\tilde{Q}$, $\tilde{F}$, $\hat{F}(\tilde{Q})$, $\tilde{D}$ FOR FITTED EXPONENTIAL $\hat{F}(x) = 1 - \exp(-x/\hat{\mu})$,
$\hat{\mu} = 18.2$; $\tilde{D}(u)$ = CUMULATIVE EXPONENTIAL WEIGHTED SPACINGS (CUMWTSPAC).

  j    Q^(u(j))   Q~(u(j))   F~(Q~(u(j)))   F^(Q~(u(j)))   D~(u(j)) CUMWTSPAC
  0       .00         0          .00            .00              .00
  1      8.13         6          .36            .28              .27
  2     16.69        16          .60            .58              .56
  3     25.26        26          .75            .76              .73
  4     33.39        36          .84            .86              .85
  5     41.95        46          .90            ...              .92
  6     51.26        56          .94            ...              .96
  7     63.89        66          .97            .97              .986
  8     83.91        76          .99            .98              .997
  9     96.54        86         1.00            .99             1.00

Figures 6 and 7 show how we plot $\tilde{D}(u)$ for comparison with $D_0(u) = u$. In addition to the graphical
diagnostic of the plot, there are many numerical diagnostics that can be performed.

9. Quantile Simulation and Distribution of Extreme Values.

A general distribution function $F(x)$, $-\infty < x < \infty$, is a non-decreasing function continuous from the
right. Its quantile function (or inverse distribution function), defined by

$$Q(u) = \inf\{x : F(x) \ge u\},$$

is a non-decreasing function continuous from the left. It is an inverse under inequality; for any x and u

$$F(x) \ge u \quad \text{if and only if} \quad x \ge Q(u).$$

An important property of quantile functions is a formula for functions of random variables. THEOREM.
Assume g is non-decreasing and continuous from the left. Then Y = g(X) has quantile function

$$Q_Y(u) = g(Q_X(u)).$$

One can represent X in terms of a uniform [0,1] random variable U by X = Q(U) since Q(U) has
quantile function $Q(Q_U(u)) = Q(u)$.

When F is continuous, one can transform X to U, a uniform [0,1] random variable, by U = F(X) since
F(X) has quantile function $F(Q(u)) = u$.

A random sample X(1), ..., X(n) of X can be simulated by generating a random sample U(1), ..., U(n)
of U, and forming X(j) = Q(U(j)). This process, illustrated in Figure 8 for the normal and Cauchy
distributions, demonstrates that the quantile function provides a powerful graphical representation of a distribution
because of the following equivalence: (1) a random sample of X, (2) observing Q(u), the quantile function of
X, at a random sample of points on the unit interval. To compare two distributions, such as the normal or
Cauchy, one way is to plot (as in Figure 8) graphs of their identification quantile functions plotted on the
same scale (the longer tailed one will have to be truncated at a suitable value).
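The simulation step X(j) = Q(U(j)) can be sketched as follows for the normal and Cauchy cases. This is only an illustration using the Python standard library; the function names, the seed, and the sample size are our own choices, and the standard Cauchy quantile function $\tan(\pi(u - .5))$ is used without any rescaling.

```python
import math
import random
from statistics import NormalDist


def q_normal(u):
    """Quantile function of the standard normal distribution."""
    return NormalDist().inv_cdf(u)


def q_cauchy(u):
    """Quantile function of the standard Cauchy distribution."""
    return math.tan(math.pi * (u - 0.5))


def simulate(q, n, rng):
    """Form X(j) = Q(U(j)) from n uniform random numbers U(j)."""
    sample = []
    while len(sample) < n:
        u = rng.random()
        if 0.0 < u < 1.0:  # inv_cdf is only defined on the open unit interval
            sample.append(q(u))
    return sample


rng = random.Random(1986)  # arbitrary seed, for reproducibility only
normal_sample = simulate(q_normal, 5000, rng)
cauchy_sample = simulate(q_cauchy, 5000, rng)
```

Sorting either sample and plotting it against (j - .5)/n gives back an approximation to the quantile function used to generate it, which is the equivalence stated above.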

The representation of X in terms of U by X = Q(U) provides a quantile approach to the distribution
theory of order statistics and extreme values. Let X(1;n) < ... < X(n;n) be the order statistics of a
random sample X(1), ..., X(n). The kth order statistic X(k;n) has the same distribution as Q(U(k;n))
where U(k;n) is the kth order statistic of a random sample from uniform [0,1].

10. Comparison Quantile Function.

A quantile based concept that unifies parameter estimation and goodness of fit hypothesis testing
procedures is the comparison quantile function $D(u) = F(G^{-1}(u))$ which compares two distribution functions
F(x) and G(x). The comparison quantile density is

$$d(u) = D'(u) = f(G^{-1}(u)) / g(G^{-1}(u)).$$

The Kullback information divergence can be evaluated by

$$I(G; F) = -\int_{-\infty}^{\infty} \log(f(x)/g(x))\, g(x)\, dx = \int_0^1 -\log d(u)\, du.$$

The graph of d(u) provides insight into the rejection method of simulation. One seeks to generate a
sample X(1), ..., X(m) from F as an acceptable subset of a sample Y(1), ..., Y(n) from G(x). THEOREM.
Assume that D(0) = 0 and there is a constant c such that $d(u) \le c$ for all u. Generate two independent
uniform [0,1] random variables U(1) and U(2). Acceptance and rejection rule: If

$$U(2) \le d(U(1))/c,$$

then accept $Y = G^{-1}(U(1))$ as an observed value of X. Otherwise reject Y. (Continue by generating two
more uniform [0,1] random variables). The probability of acceptance is 1/c.

The relation between two distributions F and G is best understood by a plot of $u_2 = d(u_1)$.

This plot can be used to graphically describe the rejection rule of simulation and to prove it. Verify that
the area under the curve from $u_1 = 0$ to $u_1 = G(x)$ equals D(G(x)) = F(x); the event that $U(1) \le G(x)$
and $U(2) \le d(U(1))/c$ has probability F(x)/c; the event that $X \le x$ can be shown to have probability F(x).
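As a concrete illustration of the acceptance and rejection rule, the sketch below takes F to be the standard normal and G the standard Cauchy; this pairing is our own choice and is not taken from the paper. For it the comparison density is bounded by $c = \sqrt{2\pi/e} \approx 1.52$, attained where $G^{-1}(u) = \pm 1$, so the acceptance probability should be near 1/c.

```python
import math
import random


def cauchy_quantile(u):
    """G^{-1}(u) for the standard Cauchy distribution."""
    return math.tan(math.pi * (u - 0.5))


def d(u):
    """Comparison density d(u) = f(G^{-1}(u)) / g(G^{-1}(u)), F normal, G Cauchy."""
    x = cauchy_quantile(u)
    f = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    g = 1.0 / (math.pi * (1.0 + x * x))
    return f / g


# Upper bound c on d(u) for this particular pair of distributions.
C = math.sqrt(2.0 * math.pi / math.e)


def rejection_sample(n_pairs, rng):
    """Apply the rule 'accept Y = G^{-1}(U1) if U2 <= d(U1)/c' to n_pairs uniform pairs."""
    accepted = []
    for _ in range(n_pairs):
        u1, u2 = rng.random(), rng.random()
        if u2 <= d(u1) / C:
            accepted.append(cauchy_quantile(u1))
    return accepted


sample = rejection_sample(20000, random.Random(1986))  # arbitrary seed
print(len(sample) / 20000, 1.0 / C)                    # observed vs. theoretical acceptance rate
```

The accepted values behave like a standard normal sample, while the fraction accepted estimates 1/c; a smaller bound c (a flatter d(u)) means fewer rejected pairs.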

11. Nonparametric Estimation of Probability Density.

To identify distributions that fit data, one can use parametric models such as the location-scale
parameter model $Q(u) = \mu + \sigma Q_0(u)$, or one can nonparametrically form estimators $\hat{f}(x)$ of the probability density
function (see Silverman (1986)). We consider only the kernel estimator

$$\hat{f}(x) = (1/n) \sum_{j=1}^{n} (1/h)\, K((x - X(j))/h),$$

where K(x) is a probability density function and h is a bandwidth to be selected.

For K we recommend (Parzen (1962)) the 'Parzen window' which is the probability density of the sum
of four uniforms:

$$K(x) = \begin{cases} (4/3) - 8x^2 + 8x^3, & 0 \le x \le .5 \\ (8/3)(1 - x)^3, & .5 \le x \le 1 \\ 0, & 1 < x \\ K(-x), & x < 0 \end{cases}$$

As a first choice to consider for h, by adapting Silverman (1986), p. 47, we recommend

$$h_{opt} = K(0)\, DQ\, n^{-.2}.$$
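A minimal sketch of the kernel estimator with the Parzen window follows. The window is coded in its standard piecewise cubic form on [-1, 1] (value 4/3 at zero, 1/3 at x = .5), which is how we read the display above; the bandwidth h is simply passed in as an argument, since the quantity DQ in the bandwidth rule is defined elsewhere in the paper. Function names are ours.

```python
def parzen_window(x):
    """Parzen window: a symmetric piecewise cubic probability density on [-1, 1]."""
    a = abs(x)  # K(-x) = K(x)
    if a <= 0.5:
        return 4.0 / 3.0 - 8.0 * a * a + 8.0 * a ** 3
    if a <= 1.0:
        return (8.0 / 3.0) * (1.0 - a) ** 3
    return 0.0


def kernel_density(x, sample, h):
    """Kernel estimate (1/n) * sum over j of (1/h) * K((x - X(j)) / h)."""
    n = len(sample)
    return sum(parzen_window((x - xj) / h) for xj in sample) / (n * h)


# Hypothetical data and bandwidth, for illustration only.
data = [0.0, 1.0, 3.0]
print(kernel_density(1.0, data, h=1.5))
```

Because the window vanishes outside [-1, 1], each observation contributes to the estimate only within a distance h of itself, which makes the effect of the bandwidth choice easy to see.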

To accept or reject the goodness of the value of h chosen we judge the deviation from uniformity of the
comparison quantile function $\tilde{D}(u) = \hat{F}(\tilde{Q}(u))$. We evaluate this function at u = (j - .5)/n by $\hat{F}(X(j;n))$.
Other choices of h are multiples of $h_{opt}$ based on diagnostics or the tail behavior of the distribution, given
by QI(.99) - QI(.01). The deviation of $\tilde{D}(u)$ from uniformity is used to guide the search for the best value
of h for the data being analysed.

The details of this procedure for choosing a kernel probability density estimator cannot be given in this
paper. It is best explained by examples of the quality of nonparametric probability density estimators to
which it leads for famous data sets (Buffalo snowfall, Yellowstone geyser eruption times) which are used as
test cases for density estimation methods (compare Silverman (1986)).


12.  Conclusion. 

The process of analysing a univariate sample can be viewed as fitting a smooth distribution $\hat{F}(x)$ to
a sample distribution $\tilde{F}(x)$. The process of comparing $\hat{F}$ and $\tilde{F}$ requires a knowledge of the theory and
practice of quantile functions. 'In order to get to the fruit of the tree you have to go out on a limb' is a
proverb that statisticians may take as an omen that they should explore the quantile limb which is always
lurking.


ABSTRACT

Wu (1985) proposed an efficient class of sequential designs for estimating
response distribution quantiles in a sensitivity test environment. Here, a
good performer within that class of designs, logit-MLE, is compared to a
Delayed Robbins-Monro procedure in which the final quantile estimate is
obtained via maximum likelihood. Their similar Monte-Carlo performance
under many test conditions is discussed. Implications for sample size
determination when estimating the median and 3rd quartile are briefly
considered.


I.  INTRODUCTION 


A  sensitivity  test  is  a  destructive  test  in  which  a  level  of  stimulus  is  applied  to  an 
experimental  unit  and  a  binary  response  is  observed.  The  binary  response  is  commonly 
referred  to  as  a  success  or  failure  under  the  level  of  stimulus  chosen. 

Each member of the population from which experimental units are sampled is
assumed to have a critical stimulus. For a given experimental unit, a stimulus applied
at or above the critical stimulus level would necessarily result in a success. A stimulus
below this critical level would result in a failure. Critical stimulus is a continuous
random variable which is not directly observable; it is observed only through success or
failure. A success or failure constitutes only partial or indirect information, as it
indicates only whether the stimulus level chosen was at or above, or below, the critical
stimulus for that unit.

In a sensitivity test, an adequate characterization is desired of some region or
quantile of the response distribution, that is, the distribution function of the random
variable, critical stimulus. Such a characterization lends insight, in a probabilistic sense,
into the sensitivity of the population to various levels of stimulus. To this end extensive
literature exists, some of which is contained in our list of references.

Much of the work contained in these references pertains to sequential design and
estimation. In many applications, data are expensive to collect and are gathered most
cost effectively in a sequential manner. This is the case in large-caliber-munition testing
for the Army. In this sequential setting our dual objective is, first, to choose a good
design and estimation procedure among those available and, second, to briefly consider
sample size determination for estimating distribution quantiles at specified levels of
precision and under a variety of test conditions.


II.  DESIGN  AND  ESTIMATION 

The proposal of a new class of sequential designs and a detailed comparison of the
new class to existing procedures are given by Wu (1985). Under varied test conditions a
comparison of these procedures, some of which are modified, is given by Bodt and
Tingey (1987). Drawing from these two studies, only the Delayed Robbins-Monro with
maximum likelihood estimation and the logit-MLE will be considered as candidate
procedures.

The Delayed Robbins-Monro (DRM) is a modification of the Stochastic Approximation
Method of Robbins and Monro (1951). Denote the nth level of stimulus as $x_n$, the
nth response as $y_n$ and the quantile of interest as $L_p$. Let $y_n = 1$ signify a success and
$y_n = 0$ signify a failure. Then, referencing the work of Kesten (1958), Cochran and
Davis (1984), Davis (1989), the next design points for a DRM-c design are given by

$$x_{n+1} = x_n - c\,(y_n - p), \tag{1}$$

where c is an appropriately chosen constant according to the variance of the population.
Data are collected in this manner until a reversal occurs. Reversal is the occurrence of a
(success, failure) or (failure, success) in succession. Subsequent design points are chosen
according to the usual Stochastic Approximation Method by

$$x_{n+1} = x_n - \frac{c}{n-k+1}\,(y_n - p), \tag{2}$$

where k is the sample number corresponding to the first reversal.
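The data collection rule in (1) and (2) can be sketched as below. This is an illustration only: it assumes the step after the first reversal is c/(n - k + 1), takes k to be the sample number at which the reversal is completed, and uses a standard normal critical stimulus for the simulated responses. The function names and the seed are ours.

```python
import random


def drm_design(x1, c, p, n, respond):
    """Generate design points x_1..x_{n+1} and responses y_1..y_n under DRM-c.

    respond(x) must return 1 (success) or 0 (failure) at stimulus level x.
    """
    xs = [x1]
    ys = []
    k = None  # sample number of the first reversal; None until it occurs
    for i in range(1, n + 1):  # i plays the role of n in equations (1) and (2)
        y = respond(xs[-1])
        ys.append(y)
        if k is None and i >= 2 and ys[-1] != ys[-2]:
            k = i
        # Constant step c before the first reversal, shrinking step afterwards.
        step = c if k is None else c / (i - k + 1)
        xs.append(xs[-1] - step * (y - p))
    return xs, ys, k


def make_normal_responder(rng, mean=0.0, sd=1.0):
    """Simulated unit: success when the stimulus is at or above a normal critical level."""
    def respond(x):
        critical = rng.gauss(mean, sd)
        return 1 if x >= critical else 0
    return respond


xs, ys, k = drm_design(x1=-3.0, c=10.0, p=0.5, n=35,
                       respond=make_normal_responder(random.Random(1986)))
print(k, xs[-1])
```

With a poor starting point the design moves in steps of c(1 - p) or cp until the responses change, and only then begins to shorten its steps.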

The primary advantage to delaying the reduction in step size until the first reversal
is evident in the common situation where a reasonable guess for the quantile location is
not available. The design refrains from attempted convergence until some indication
(reversal) of being in the desired region is present. This convention makes the most
sense if the quantile of interest is the median but will be used here for the .75 quantile
as well. Davis (1060) shows the DRM to be a good performer.

The logit-MLE is one application of Wu's general technique found by him to be
effective in estimating the quantiles of the distribution. The next design point is taken
to be the desired quantile's maximum likelihood estimate based on all of the data
gathered up until that point. The maximum likelihood procedure assumes a logistic model,
hence the name logit-MLE. Silvapulle (1981) shows that the unique existence of this
maximum likelihood estimate is guaranteed by a zone of "mixed" results, the necessary
and sufficient condition for which can be expressed as

$$(x^1_{\min}, x^1_{\max}) \cap (x^0_{\min}, x^0_{\max}) \neq \emptyset, \tag{3}$$

where $x^1_{\min}$ is the minimum level of stimulus at which a success was observed, and the
remaining limits are defined analogously for successes (superscript 1) and failures
(superscript 0). In the first few tests there is reasonable likelihood that this condition
will not be satisfied. Furthermore, use of maximum likelihood estimation on only a few
sensitivity data points often results in poor estimates. What is needed is another data
collection procedure to be used until the logit-MLE can be applied. In this study, the Delayed
Robbins-Monro was used until condition (3) was satisfied and more stable maximum
likelihood estimates were likely.
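A check of condition (3) on the data gathered so far might look like the following sketch. It reads the condition literally as an overlap of two open intervals, one spanned by the stimulus levels giving successes and one by those giving failures; whether the endpoints should be included is an assumption on our part, as is the function name.

```python
def mixed_results_zone_exists(xs, ys):
    """Return True if the success and failure stimulus ranges overlap (condition (3)).

    xs: stimulus levels tested so far; ys: responses (1 = success, 0 = failure).
    Both ranges are treated as open intervals (assumption), so a range that
    collapses to a single level is counted as not overlapping.
    """
    successes = [x for x, y in zip(xs, ys) if y == 1]
    failures = [x for x, y in zip(xs, ys) if y == 0]
    if not successes or not failures:
        return False  # only one kind of response observed so far
    lower = max(min(successes), min(failures))
    upper = min(max(successes), max(failures))
    return lower < upper


print(mixed_results_zone_exists([1, 2, 3, 4], [0, 1, 0, 1]))  # interleaved responses
print(mixed_results_zone_exists([1, 2, 3, 4], [0, 0, 1, 1]))  # perfectly separated responses
```

Perfectly separated data, where every failure lies below every success, is exactly the case in which the logistic likelihood has no finite maximizer, which is why a fallback design is needed early in the test.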

An algorithm for this procedure is to collect data as per DRM-c until condition (3)
is satisfied or sample point six has been reached, whichever comes later. After that
time the next design point, $x_{n+1}$, is taken to be the logit-MLE with restrictions imposed
by the following equations.

If $d_n$ is the solution of

$$\hat{L}_p = x_n - \frac{d_n}{n-k+1}\,(y_n - p), \tag{4}$$

where $\hat{L}_p$ is the logit-MLE for the pth quantile based on n observations, then

$$x_{n+1} = x_n - \frac{d_n^*}{n-k+1}\,(y_n - p), \tag{5}$$

where

$$d_n^* = \max(\delta, \min(d_n, d)), \qquad d > \delta > 0. \tag{6}$$

This restriction prohibits the procedure from varying wildly in its choice of the next
design point. Henceforth, the above procedure will be denoted MLE(c,d). For a more
detailed discussion see Wu (1985).
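The restriction on the next design point can be sketched as a single function, assuming the logit-MLE of the quantile has already been computed by some other routine and is passed in as `L_hat`. The reading of the step as $d_n^*/(n-k+1)$ and the argument names, including the lower bound `delta`, are assumptions made for this illustration.

```python
def next_design_point(x_n, y_n, p, n, k, L_hat, d_max, delta):
    """Next stimulus level under the truncated logit-MLE rule.

    x_n, y_n : current stimulus level and response (1 or 0)
    p        : quantile level, strictly between 0 and 1
    n, k     : current sample number and sample number of the first reversal
    L_hat    : logit-MLE of the pth quantile from the first n observations
    d_max    : upper truncation bound d; delta: lower bound, d_max > delta > 0
    """
    m = n - k + 1
    # d_n is the step constant that would move x_n exactly onto L_hat.
    d_n = (x_n - L_hat) * m / (y_n - p)
    d_star = max(delta, min(d_n, d_max))
    return x_n - (d_star / m) * (y_n - p)


# Hypothetical numbers: the untruncated move lands on L_hat itself.
print(next_design_point(2.0, 1, 0.5, 10, 4, L_hat=1.0, d_max=30.0, delta=1.0))
```

When the estimate lies far from the current level, or on the side the last response argues against, the truncation replaces the jump to `L_hat` by a bounded step in the direction a Robbins-Monro update would take.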

Preparing to compare the two procedures, DRM-c and MLE(c,d), we note that
when no prior knowledge of the distribution is available Wu (1985) finds MLE(c,d) to be
better than the RM type designs he examined. In addition, Bodt and Tingey (1987)
show in a Monte-Carlo study that MLE(c,d) specifically outperforms DRM-c under a
variety of practical test conditions such as restricted sample sizes (< 15), stimulus noise,
varied response distributions, and varied combinations in the selection of the initial
design point and the constant c. Based on this work we will make one final modification
of the estimation technique when using DRM-c. To motivate that change we will first
take a brief digression.

One goal of this experimentation is to precisely estimate a quantile of the critical
stimulus distribution. Since the advent of sequential procedures in this setting, much of
the attention has been placed on asymptotic convergence properties. It is true that
many of these procedures for collecting data also serve to consistently estimate. In
addition, designs such as DRM-c are nonparametric so no restrictive model assumptions
need be made regarding the shape of the response distribution. For these reasons some
experimenters have ceased to separate design and estimation when considering this
problem. Consistent with the experiment goal mentioned, the performance of various
combinations of design and estimation procedures is examined in Bodt and Tingey
(1987). In the restricted sample size environment we found that if data were collected
using DRM-c and estimation was carried out via maximum likelihood, the results were
as good as or better than for any other design and estimation scheme studied. This result
was true under the variety of practical test conditions mentioned previously.

Thus the promised modification is that when using DRM-c, data are collected as that
procedure dictates, but final estimation is accomplished using the same logit-MLE
technique as per Wu's procedure. We will continue to refer to this combined design and
estimation scheme as DRM-c.


III. A SIMULATION STUDY

Before making sample size determinations, we wished to first compare DRM-c and
MLE(c,d) under practical test conditions and sample sizes which are not unduly
restricted. This comparison was performed in a Monte-Carlo study under the crossed
conditions listed in Figure 1. For this part of the study the .5 quantile was estimated. The
measure of precision was $(MSE)^{1/2}$. The number of iterations performed was 500 per
treatment combination.


Figure  1.  Factors  Included  in  the  Design. 

Four response distributions representing a variety of shapes were chosen. Each had
median equal to zero. Three were given a standard deviation of unity. The quartiles of
the Cauchy were made to match those of the normal distribution. The purpose in
considering c = 10, 20 for DRM-c is that we wished to compare DRM-c to MLE(c,d) under
suboptimal conditions for the data collection aspect of DRM-c while maintaining good
conditions for MLE(c,d). Based on Wu's findings, d = 30 should yield good results for
MLE(c,d). Through results of Chung (1954) and Hodges and Lehmann (1955) the
optimum choice of c for the usual Stochastic Approximation Method is $(F'(.5))^{-1}$ where
F is the response distribution. For the response distributions chosen, these values of
$(F'(.5))^{-1}$ range between 2 and 2.5. Thus the chosen values 10, 20 are much removed
from the optimum and will act to slow convergence relative to the optimum. It is in
such an environment, suboptimal values of c or limited prior knowledge of the response
distribution, where MLE(c,d) was shown to be superior to RM type designs.

The results of the simulation comparison are efficiently represented in graphical
form. In Figure 2 we are examining the relative magnitude of $(MSE)^{1/2}$ for nine sample
sizes. The true response distribution was normal, and the initial design point was zero
as indicated by the arrow. To obtain the DRM-10 and MLE(10,30) points for each
sample size, the same random number sequence was used for both in each iteration so that
any difference in the quality of the design points chosen was a function of the design
under these particular conditions. Unless otherwise noted, the procedures yield
estimates which are, for practical purposes, unbiased.

Given that the response distribution standard deviation is unity, the mild
fluctuations between procedures illustrated here are considered negligible. Similar results hold
true for the uniform and Cauchy distributions. See Figures 3-4.

The disparity in precision among the three distributions for small sample sizes is
believed to be caused by the different response distribution shapes. The reasons for this
belief are given in the following discussion. Since the disparity is most noticeable
between the Cauchy and the other two, the discussion will focus on the effect of the
heavy tails of the Cauchy distribution.

Figure 3. Precision Under a Uniform Response Distribution.

Figure 4. Precision Under a Cauchy Response Distribution.


First, consider the estimation of the median of the response distribution. Define a
wrong decision as moving away from the median to collect the next design point.
Wrong decisions inflate the variance of the maximum likelihood estimate. This follows
because when the design steps away from the median it causes the collection of data
holding less information regarding its location. Banerjee (1980) shows this rigorously for
a normal response distribution. Additionally, when a wrong decision occurs a larger
value of $c/(n-k+1)$ (consistent with small samples) will result in data collection farther
from the median than for a small value of that quantity. In an extreme case the design
may begin errantly sampling in the tail of the response distribution and take several
steps to return to levels of stimulus more likely to yield useful information. Second, if
presently sampling in the tail of the response distribution, the Cauchy distribution is
more likely to cause a wrong decision than the other two. If $x_n$ is currently below the
true median a wrong decision occurs with probability $F(x_n)$. Thus, for a fixed $x_n$ in the
tail area of the Cauchy distribution, $F(x_n)$ is large relative to corresponding probabilities
as evaluated for the normal and uniform distributions. Third, if few samples are used
the importance of the informational content of those samples is accentuated, thus
leading to the disparity mentioned.
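The size of the effect described in the second point can be gauged with a small calculation of $F(x_n)$ at a level below the median. The sketch assumes a standard normal, a Cauchy scaled so that its quartiles match the normal's, and a uniform with median zero and unit standard deviation; the uniform's parameterization and the evaluation point of -3 are our assumptions.

```python
import math
from statistics import NormalDist

Z75 = NormalDist().inv_cdf(0.75)  # upper quartile of the standard normal, about 0.6745


def cdf_normal(x):
    """Standard normal distribution function."""
    return NormalDist().cdf(x)


def cdf_cauchy(x):
    """Cauchy with median 0 and scale Z75, so its quartiles are -Z75 and +Z75."""
    return 0.5 + math.atan(x / Z75) / math.pi


def cdf_uniform(x):
    """Uniform with median 0 and standard deviation 1 (support is +/- sqrt(3))."""
    half_width = math.sqrt(3.0)
    return min(1.0, max(0.0, (x + half_width) / (2.0 * half_width)))


x_n = -3.0  # a design point well below the median
for name, cdf in [("normal", cdf_normal), ("Cauchy", cdf_cauchy), ("uniform", cdf_uniform)]:
    print(name, cdf(x_n))  # probability of a success, i.e. of a wrong decision, at x_n
```

At such a level the Cauchy probability is larger than the normal one by more than an order of magnitude, while the uniform probability is zero, in line with the ordering of precision seen for small samples.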

In Figures 5-8 all 500 iterations are represented in histogram form. The
observations are estimates of the median by DRM-10 or MLE(10,30) under the Cauchy response
distribution for a sample size of 15 or 35. The arrow indicates the true median. As
expected after viewing Figure 4, no substantive difference exists among the empirical
densities.

In Figures 9-10 the exponential response distribution is considered, with results
similar to the previous three in terms of relative precision. However, as Figure 10
illustrates, the estimates produced by either method are biased, with DRM-10 arguably more
biased than MLE(10,30). V50, zero in this case, denotes the median of the response
distribution associated with critical velocity. Velocity is a common stimulus in Army
testing. Each point on Figure 10 represents the average of 500 estimates of V50. The
reasons for the bias are similar to the reasons for precision disparity mentioned earlier.
Although not displayed, similar results hold true for comparison of DRM-20 to
MLE(20,30) and under the condition of the initial design point equaling the median - 3.

In estimating the median, the results are clear. There is virtually no difference in
precision between the two procedures for a variety of response distribution forms. In
general the designs must be judged equivalent in their ability to gather pertinent data
for the estimation of the median, since the estimation is accomplished using maximum
likelihood with a logistic model for each and the random number sequences were
identical for each. The only studied exception was that MLE(c,d) produced slightly less
biased estimates than did DRM-c for the exponential response distribution. In this case
it appears that MLE(c,d) gathered data in a slightly more efficient manner.

Their general equivalence is important to consider when choosing a design.
Extending the comparison to computational ease, DRM-c is easier to employ than is MLE(c,d)
in many practical settings. Prior to each test DRM-c requires of the field experimenter
only the solution of a single equation with one unknown. This is a computation that
certainly could be performed by hand. Not until the data are collected is it necessary to
iteratively solve for the maximum likelihood estimates in a more time consuming effort.
Assuming that the conditions for maximum likelihood estimation have been met, use of
MLE(c,d) would require that prior to each sample taken, an iterative solution for the
estimates be performed. The potential difficulties in a field test are obvious.

Figure 5. Empirical Density of the Estimator DRM-10.

Figure 7. Empirical Density of the Estimator DRM-10.

Figure 8. Empirical Density of the Estimator MLE(10,30).

Figure 9. Precision Under an Exponential Response Distribution. (Horizontal axis: SAMPLE SIZE.)

Figure 10. Average of the Estimates Under an Exponential Response Distribution.

At this point we mention two additional facts regarding MLE(c,d). First, there is a
suggested provision for delaying implementation of maximum likelihood estimation, Wu
(1985). We denote this MLE(c,d,f), where f is a lag delay after the unique existence
criterion is satisfied, after which maximum likelihood estimation is to be employed. The
intent of this provision is to delay use of maximum likelihood estimation until it is likely
that the estimates will be more stable. This is why we used maximum likelihood no
sooner than in the selection of the 7th design point. We mention this for completeness
because f could be chosen to be variable so that maximum likelihood estimation was
delayed until the last data point was gathered, in which case MLE(c,d,f) reduces to
DRM-c as defined in this study.

Second, although an iterative solution is necessary to solve for the maximum
likelihood estimates, Wu (1985) does suggest an approximation which would eliminate the
need for an iterative solution. The approximation is valid if design points are close to
the quantile being estimated. Caution is warranted when using this approximation in a
small sample test environment with no prior knowledge of the response distribution.
There, closeness of the design points to the quantile of interest cannot be assured.

Thus  far  only  the  median  has  been  considered.  It  is  certainly  possible,  and  in  many 
situations  more  desirable,  to  estimate  quantiles  other  than  the  median.  The  median  is 
the  quantile  commonly  used  for  inference  primarily  because  it  is  the  easiest  to  estimate. 
We  also  compared  the  two  procedures  for  estimating  the  3rd  quartile.  In  practice,  for 
estimating  quantiles  beyond  the  first  or  third  quartile,  specific  extreme  value  designs 
may  be  more  practical. 

Figure 11 shows the precision of the two procedures when estimating the 3rd
quartile of the normal distribution. Once again, any differences between the two methods
appear negligible. The procedures appear to be biased in estimating the 3rd quartile
for small sample sizes. In Figure 12 the ordinate is now averaged estimates of the 3rd
quartile. The arrow represents the true quantile value, .675.

Figures 13-14 concern estimation of the 3rd quartile of the Cauchy response
distribution. Remember that the normal and Cauchy response distributions were chosen to
have the same quartiles. Thus by comparison we see that the precision of the methods
is much worse for the heavier tailed distribution. It does appear that MLE(10,30) tends
to be more precise and less biased than DRM-10 for larger sample sizes.

Figures 15-16 constitute our cursory look at sample size determination. Our
approach was to indicate the best and worst precision for each method for the different
sample sizes. The extreme precisions were extracted from the performance of the
procedures under the four response distributions. Initial design point selection and
magnitude of the constant c were not considered, since we are primarily interested in
larger samples where they have no noticeable effect. Another common problem,
stimulus noise, was not considered. Bodt and Tingey (1987) show maximum likelihood
estimation under a similar normal response to be insensitive to stimulus noise.

Figure 15 clearly indicates the range of precision gained from additional samples.
Figure 16 compares sample sizes necessary for estimating the median and third quartile
of the response distributions with approximately the same precision. Note the cost
effectiveness of using the median if it can serve as a reasonable point for inference in the
application at hand.

Figure 16. Sample Sizes Required for Estimating the Median and Third
Quartile with Approximately the Same Precision.


IV.  SUMMARY 

Under  suboptimal  conditions  for  the  stochastic  approximation  method,  DRM-c’s 
ability  to  collect  data  pertinent  to  the  estimation  of  the  median  of  the  response  distribu- 
tion,  was  comparable  to  that  of  MLE(c,d).  This  was  true  over  a  variety  of  response  dis¬ 
tribution  shapes  and  sample  sizes.  For  the  estimation  of  the  3rd  quartile  they  were 
again  comparable  when  the  response  distribution  was  normal.  If  the  response  followed  a 
Cauchy  distribution,  MLE(c,d)  was  slightly  superior  to  DRM-c.  The  experimenter  is 
encouraged  to  take  into  account  these  findings  when  planning  a  sensitivity  test. 
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TESTS  FOR  CONSISTENCY  OF  VULNERABILITY  MODELS 


This  paper  studies  the  problem  of  confirming  a  set  of  estimated  probabilities  of  kill 
for  a  small  number  of  independent,  but  not  identically  distributed,  Bernoulli  outcomes. 
The  problem  originates  from  vulnerability  studies  on  tanks  in  which  kill  probabilities  of 
individual components are desired. The cost of resources makes it infeasible to obtain
these probability estimates by repeated field testing. Therefore, computer simulation is
used to get the desired estimates. Researchers then want to test the accuracy of these
computer generated values. Again, the economics of live firing sometimes allows for the
firing of only one round at the tank. The question becomes "Can the kill probabilities
obtained through simulation be confirmed by the results of a single round field test?"

Denote the simulation estimates by the vector $[p_1^0, p_2^0, \ldots, p_k^0]$, where $p_i^0$ is the
probability of kill for the ith tank component of interest. If we assume that the
components are independent, we may rewrite the above question in the form of a hypothesis
test.

$$H_0: p_1 = p_1^0,\ p_2 = p_2^0,\ \ldots,\ p_k = p_k^0$$

$$H_1: p_i \neq p_i^0 \text{ for some } i.$$


Note that when $p_1^0 = p_2^0 = \ldots = p_k^0 = p^0$, this is the k-trial binomial case,
$B(k, p^0)$, and the null hypothesis is simply $H_0: p = p^0$. We seek a test for the more
general case of unequal $p_i$'s. As may be expected, the small size of k will present
problems with power. Also, it should be pointed out that the alternative hypothesis only
says that at least one inequality exists. In practice though, it will usually take several
gross differences between the hypothesized and actual vectors for any test to reject $H_0$.
Therefore the tests to be explored will not be able to validate the hypothesized
probabilities; they will be able to check for consistency between the simulation estimates and
field test results as a whole.


Suppose we observe a set of k independent 0 or 1 outcomes (representing survive or
kill), denoted by the row vector $A = [a_1, \ldots, a_k]$. For example, if k = 5, we may
have A = [0, 1, 0, 0, 1]. There are $2^k$ possible outcome vectors. The probability of
observing outcome vector A under the null hypothesis is given by the density function

$$P(A) = (p_1^0)^{a_1} (1 - p_1^0)^{1 - a_1} \cdot (p_2^0)^{a_2} (1 - p_2^0)^{1 - a_2} \cdots (p_k^0)^{a_k} (1 - p_k^0)^{1 - a_k} = \prod_{i=1}^{k} (p_i^0)^{a_i} (1 - p_i^0)^{1 - a_i}.$$
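As an illustration of the density function, the following Python sketch enumerates all $2^k$ outcome vectors and evaluates P(A) for each. The function names are hypothetical, and the hypothesized vector is the six-component example that appears in this paper's summary.

```python
from itertools import product


def outcome_probability(a, p0):
    """Return P(A) under H0 for a 0/1 outcome vector a and kill probabilities p0."""
    prob = 1.0
    for a_i, p_i in zip(a, p0):
        # A kill (1) contributes p_i; a survival (0) contributes 1 - p_i.
        prob *= p_i if a_i == 1 else 1.0 - p_i
    return prob


def all_outcomes(k):
    """Enumerate all 2**k possible 0/1 outcome vectors of length k."""
    return list(product((0, 1), repeat=k))


p0 = [0.01, 0.02, 0.03, 0.97, 0.98, 0.99]
outcomes = all_outcomes(len(p0))
total = sum(outcome_probability(a, p0) for a in outcomes)
print(len(outcomes), total)
print(outcome_probability((0, 0, 0, 1, 1, 1), p0))
```

The probabilities over all outcome vectors sum to one, which is a quick check that the enumeration is complete.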


A test of the null hypothesis needs some way of ordering the $2^k$ possible outcome
vectors. We will examine three test procedures characterized by their ordering schemes.


Test One


This test rejects the null hypothesis if the observed vector is among some
predefined critical set of "rarest" outcomes. The outcome set is ordered by the density
function in increasing magnitude, and each outcome is numbered so that $A_1$ is the least
likely to occur and $A_{2^k}$ is the most likely. Define a "cumulative function" B, whereby

$$B_i = \begin{cases} P(A_1), & i = 1 \\ B_{i-1} + P(A_i), & i = 2, 3, \ldots, 2^k. \end{cases}$$

Choosing a "c" such that

$$c = \max \{ j \mid B_j \le \alpha \text{ and } P(A_j) \neq P(A_{j+1}) \},$$

then the set

$$A_{RR} = \{A_1, A_2, \ldots, A_c\}$$

represents the c rarest outcomes and defines the rejection region for a test of $H_0$ with a
$(\alpha)100\%$ level of significance. The "test statistic" is the observed vector A; if it is in
$A_{RR}$, then $H_0$ is rejected.
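The construction of the Test One rejection region can be sketched in Python as follows. This is only an illustrative reading of the procedure: the function names are hypothetical, the tolerance used to detect equal probabilities is an assumption, and the cutoff is taken as $B_j \le \alpha$.

```python
import math
from itertools import product


def outcome_probability(a, p):
    """P(A) for a 0/1 outcome vector a under kill probabilities p."""
    prob = 1.0
    for a_i, p_i in zip(a, p):
        prob *= p_i if a_i == 1 else 1.0 - p_i
    return prob


def rarest_outcome_region(p0, alpha=0.05):
    """Return (rejection region, exact alpha) for the rarest-outcomes test."""
    outcomes = sorted(product((0, 1), repeat=len(p0)),
                      key=lambda a: outcome_probability(a, p0))
    probs = [outcome_probability(a, p0) for a in outcomes]
    c = 0
    cumulative = 0.0
    for j, p_j in enumerate(probs):  # j is 0-based, so this is B_(j+1)
        cumulative += p_j
        if cumulative > alpha:
            break
        is_last = j + 1 == len(probs)
        # Only cut between outcomes of different probability, so equally
        # likely outcomes are never split across the region boundary.
        if is_last or not math.isclose(p_j, probs[j + 1], rel_tol=1e-12):
            c = j + 1
    return set(outcomes[:c]), sum(probs[:c])


p0 = [0.01, 0.02, 0.03, 0.97, 0.98, 0.99]
region, exact_alpha = rarest_outcome_region(p0)
print(len(region), exact_alpha)
```

For this hypothesized vector the least likely outcome, [1, 1, 1, 0, 0, 0], falls inside the region and the most likely outcome, [0, 0, 0, 1, 1, 1], does not.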


Test Two


This test is based upon the number of kills observed. The underlying notion is that
under the hypothesized model, a certain number of kills is expected. Letting K(A) be
the number of observed kills, then the expected value of K(A) under the null hypothesis
is

$$E[K(A)] = p_1^0 + p_2^0 + \ldots + p_k^0 = \sum_{i=1}^{k} p_i^0.$$

If the observed K(A) is much smaller than this value, then perhaps the simulation
overestimated the kill probabilities and $H_0$ should be rejected. On the other hand, if the
observed K(A) is much larger, then $H_0$ should be rejected since the kill probabilities may
be underestimated.


To perform this test, we begin by calculating P(A) and K(A) for all $2^k$ outcomes.
The outcomes are then ordered by increasing number of kills and
numbered so that

$$K(A_1) \le K(A_2) \le \ldots \le K(A_{2^k}).$$

(The order among outcomes with the same K(A) is irrelevant.) Similarly to Test One,
the "cumulative function" is calculated. Since rejecting $H_0$ may be the result of too
large or too small a value of K(A), a two-tailed test is used. Critical values $c_1$ and $c_2$
are selected so that the actual alpha level

$$P[K(A) \le c_1] + P[K(A) \ge c_2] \quad (*)$$

is maximized but still less than or equal to $\alpha$. The rejection region for Test Two is
$K(A_{RR}) = \{0, 1, \ldots, c_1\} \cup \{c_2, c_2 + 1, \ldots, k\}$.

The simulation generated estimates will be rejected as inconsistent with the field
test if $K(A) \in K(A_{RR})$.
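A minimal Python sketch of the Test Two calculation is given below. It assumes the distribution of K(A) is built by convolving the k Bernoulli components rather than by listing all $2^k$ outcomes, which gives the same probabilities; the function names and the convention of using -1 and k + 1 for an empty tail are hypothetical.

```python
def kill_count_pmf(p):
    """Return [P(K = 0), ..., P(K = k)] for independent kill probabilities p."""
    pmf = [1.0]
    for p_i in p:
        nxt = [0.0] * (len(pmf) + 1)
        for m, q in enumerate(pmf):
            nxt[m] += q * (1.0 - p_i)   # component survives
            nxt[m + 1] += q * p_i       # component is killed
        pmf = nxt
    return pmf


def two_tailed_critical_values(pmf, alpha=0.05):
    """Return (c1, c2, size) maximizing P[K <= c1] + P[K >= c2] subject to size <= alpha.

    c1 = -1 denotes an empty lower tail and c2 = k + 1 an empty upper tail.
    """
    k = len(pmf) - 1
    best = (-1, k + 1, 0.0)
    for c1 in range(-1, k + 1):
        lower = sum(pmf[: c1 + 1])
        for c2 in range(c1 + 1, k + 2):
            size = lower + sum(pmf[c2:])
            if size <= alpha and size > best[2]:
                best = (c1, c2, size)
    return best


p0 = [0.01, 0.02, 0.03, 0.97, 0.98, 0.99]
pmf = kill_count_pmf(p0)
expected_kills = sum(m * q for m, q in enumerate(pmf))
c1, c2, size = two_tailed_critical_values(pmf)
print(expected_kills, c1, c2, size)
```

With this hypothesized vector an observed count of three kills lies strictly between the two critical values, so it is never rejected regardless of which components were killed.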

Test Three

This test examines the number of "correct responses", where a correct response is
defined as:

$$\gamma_i = \begin{cases} 1, & \text{if } a_i = 0 \text{ when } p_i^0 < .5, \text{ or } a_i = 1 \text{ when } p_i^0 > .5 \\ .5, & \text{if } p_i^0 = .5 \\ 0, & \text{otherwise.} \end{cases}$$

Therefore, a correct response of 1 is given if the more likely outcome (kill or survive) is
observed. On the other hand a correct response of 0 means that the less likely event
occurred. As somewhat of a tiebreaking policy, if the hypothesized probability is .5,
then the correct response value is .5, no matter which outcome is actually observed.

The test statistic is

$$C(A) = \gamma_1 + \gamma_2 + \ldots + \gamma_k = \sum_{i=1}^{k} \gamma_i.$$

The expected value of the test statistic is

$$E[C(A)] = C_L + k^*/2 + C_U,$$

where

$$C_L = \sum (1 - p_i^0) \text{ for all } p_i^0 < .5,$$

$$C_U = \sum p_i^0 \text{ for all } p_i^0 > .5,$$

$k^*$ = "number of $p_i^0$ equal to .5".

The test procedure begins by calculating P(A) and C(A) for all possible outcomes.
The outcomes are arranged by increasing number of correct responses
without regard for ties so that

$$C(A_1) \le C(A_2) \le \ldots \le C(A_{2^k}).$$

(*) For Test Two,
$P[K(A) \le c_1] = B_i$, for $i = \max \{ j \mid K(A_j) < K(A_{j+1}) \text{ and } K(A_j) = c_1 \}$, and
$P[K(A) \ge c_2] = 1 - B_{i-1}$, for $i = \min \{ j \mid K(A_j) > K(A_{j-1}) \text{ and } K(A_j) = c_2 \}$.

The cumulative density is computed as usual. Observing a value of C(A) much smaller
or larger than the expected value leads us to believe that $H_0$ is false. Therefore, a
two-tailed test is desired, and the critical values $c_1$ and $c_2$ are chosen to maximize

$$P[C(A) \le c_1] + P[C(A) \ge c_2] \le \alpha,$$

the actual alpha level, where

$$P[C(A) \le c_1] = B_i, \text{ for } i = \max \{ j \mid C(A_j) < C(A_{j+1}) \text{ and } C(A_j) = c_1 \},$$

$$P[C(A) \ge c_2] = 1 - B_{i-1}, \text{ for } i = \min \{ j \mid C(A_j) > C(A_{j-1}) \text{ and } C(A_j) = c_2 \}.$$

Since the rejection region is $C_{RR} = \{0, 1, \ldots, c_1\} \cup \{c_2, c_2 + 1, \ldots, k\}$, we will reject $H_0$
at the $\alpha$ level of significance if $C(A) \in C_{RR}$.
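The correct-response statistic and its expected value can be sketched in Python as shown here. The function names are hypothetical; the two hypothesized vectors and the outcome vector are the ones used for Test 3 in this paper's summary.

```python
def correct_response_score(a, p0):
    """Return C(A): the number of correct responses for outcome vector a."""
    score = 0.0
    for a_i, p_i in zip(a, p0):
        if p_i == 0.5:
            score += 0.5                     # tiebreak: always worth one half
        elif (a_i == 1) == (p_i > 0.5):
            score += 1.0                     # the more likely outcome occurred
    return score


def expected_score(p0):
    """Return E[C(A)] = C_L + k*/2 + C_U under the hypothesized vector p0."""
    c_lower = sum(1.0 - p for p in p0 if p < 0.5)
    c_upper = sum(p for p in p0 if p > 0.5)
    k_star = sum(1 for p in p0 if p == 0.5)
    return c_lower + k_star / 2 + c_upper


a = (1, 1, 1, 0, 0, 0)
p01 = [0.53, 0.52, 0.51, 0.49, 0.48, 0.47]
p02 = [0.47, 0.48, 0.49, 0.51, 0.52, 0.53]
print(correct_response_score(a, p01), correct_response_score(a, p02))
print(expected_score(p01))
```

The same outcome vector scores at the two extremes of the statistic under two hypothesized vectors that differ by only a few hundredths in each component.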


To study the three test procedures, 2000 pairs of k-dimensional probability vectors
were randomly generated for k = 6, ..., 10. The first vector of a pair $(P_0, P_1)$ was
considered the hypothesized probability vector, and the second was the alternative
probability vector. The level of significance was set at $\alpha$ = .05. The power of the three tests
was computed for each pair $(P_0, P_1)$.
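The power computation for a single pair can be sketched as below, here for Test One only. The paper does not state how the probability vectors were drawn, so the uniform draws, the seed, and the function names are all assumptions made for illustration. Power is the probability, under the alternative vector, that the observed outcome lands in the rejection region built from the hypothesized vector.

```python
import math
import random
from itertools import product


def outcome_probability(a, p):
    """P(A) for a 0/1 outcome vector a under kill probabilities p."""
    prob = 1.0
    for a_i, p_i in zip(a, p):
        prob *= p_i if a_i == 1 else 1.0 - p_i
    return prob


def rarest_outcome_region(p0, alpha):
    """Rarest outcomes whose total H0 probability stays at or below alpha."""
    ranked = sorted(product((0, 1), repeat=len(p0)),
                    key=lambda a: outcome_probability(a, p0))
    probs = [outcome_probability(a, p0) for a in ranked]
    c, cumulative = 0, 0.0
    for j, p_j in enumerate(probs):
        cumulative += p_j
        if cumulative > alpha:
            break
        # Do not split a group of equally likely outcomes.
        if j + 1 == len(probs) or not math.isclose(p_j, probs[j + 1], rel_tol=1e-12):
            c = j + 1
    return ranked[:c]


rng = random.Random(1986)                   # arbitrary seed for repeatability
k = 6
p0 = [rng.random() for _ in range(k)]       # hypothesized vector (assumed uniform)
p1 = [rng.random() for _ in range(k)]       # alternative vector (assumed uniform)

region = rarest_outcome_region(p0, alpha=0.05)
size = sum(outcome_probability(a, p0) for a in region)   # exact alpha
power = sum(outcome_probability(a, p1) for a in region)  # power against p1
print(size, power)
```

Repeating the last five statements over many random pairs would give the kind of point cloud that the power scatterplots summarize.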

Figures 1 through 3 show a graphical way of comparing power between any two
tests, A and B. Each point represents a pair $(P_0, P_1)$. Its coordinates (x,y) are the
power of Tests A and B, respectively. If Test A is more powerful than Test B, then we
expect to see a graph similar to Figure 1. If the opposite is true, the graph will be
similar to Figure 2. But if both have approximately the same power then Figure 3 is the
proper scatterplot.

Comparison of the three tests based upon the 2000 randomly generated vectors is
shown in Figures 4-8. Several observations can be made from these graphs.

1. For most pairs of vectors, and for k = 6, 7, 8, 9, 10, Test 1 has greater power than
either of the other two tests.

2. Median power increases with k for Tests 1 and 3 (see Figure 9).

3. Median power remains fairly constant for all k with Test 2. The median power
of Test 2 is not much greater than the alpha level. This indicates how poor a
procedure the test is.

When comparing the power of all three tests for each point, it was occasionally
found that the superior test was either Test 2 or Test 3. For example,

$P_0$ = [.71 .23 .10 .09 .15 .67 .50 .93]
$P_1$ = [.80 .36 .47 .34 .36 .27 .94 .95]

TEST           1                   2        3
Rej. region    187 least likely    --       {.5, 1.5, 2.5, 3.5}
Exact alpha    .0499               .0495    --
Power          .3631               .7728    .7728

This leads us to ask what, if anything, can be determined from $(P_0, P_1)$ about the power of the
tests. One possible relationship studied was the power versus the distance
between $P_0$ and $P_1$ in k-space, i.e. $\Delta(P_0, P_1)$ where

$$\Delta(P_0, P_1) = \sqrt{\sum_{i=1}^{k} (p_i^0 - p_i^1)^2} = \sqrt{(p_1^0 - p_1^1)^2 + \ldots + (p_k^0 - p_k^1)^2}.$$
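The distance is the ordinary Euclidean distance between the two probability vectors. A short Python sketch follows, using the pair of eight-component vectors quoted in the example above; the function name is hypothetical.

```python
import math


def delta(p0, p1):
    """Euclidean distance between hypothesized and alternative probability vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p0, p1)))


p0 = [0.71, 0.23, 0.10, 0.09, 0.15, 0.67, 0.50, 0.93]
p1 = [0.80, 0.36, 0.47, 0.34, 0.36, 0.27, 0.94, 0.95]
d = delta(p0, p1)
print(d)
```

Since each component lies in [0, 1], the distance can never exceed the square root of k.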


Figures 10-14 show scatterplots of this relationship for each test and sample size.
The correlations between power and $\Delta(P_0, P_1)$ are shown in Figure 15.

The problem with looking at the relationship between $P_0$ and $P_1$, however, is that
in practice we do not know what $P_1$ is. It will not be very helpful to know that the
choice of best test for a given $P_0$ is dependent upon the choice of $P_1$. We should look
for a best test given $P_0$ only. This is the topic of ongoing research.


SUMMARY 

The problem is most complicated by the fact that we must judge the entire set of
computer generated estimates on a single fired shot. While we admit that Test 1 was
not able to detect some greatly differing alternative sets of probabilities, it was in general
the best of the three test procedures. The reasons become obvious when we closely
examine the other two.

Test 2 does not take into consideration the order in which the $a_i$'s appear. For
example, let our hypothesized set of probabilities be $P_0$ = [.01, .02, .03, .97, .98, .99].
For the observed outcome vectors $A_1$ = [0, 0, 0, 1, 1, 1] and $A_2$ = [1, 1, 1, 0, 0, 0], we
compute $P(A_1)$ = .8857 and $P(A_2)$ = .000000000036. However for both outcomes we
compute $K(A_1) = K(A_2) = 3$, the expected value of the test statistic under $H_0$. Therefore
we would not reject $H_0$ in either case. Not only does Test 2 not reject $H_0$ given $A_2$
(when it obviously should), but it has managed to equate the most likely and least likely
outcomes.


Test 3 does consider the order of the observed outcomes; however, it does not
incorporate the magnitude of the $p_i^0$'s. To see how this is dangerous, let $P_{01}$ = [.53, .52, .51,
.49, .48, .47], $P_{02}$ = [.47, .48, .49, .51, .52, .53], and A = [1, 1, 1, 0, 0, 0]. Under $H_{01}$,
P(A) = .019756 and C(A) = 6, while under $H_{02}$, P(A) = .012220 and C(A) = 0. The test has
exaggerated the difference between the two probability vectors, despite their being
nearly equal.

Test 1 is the best of the three candidate procedures because it simply tries to create
the largest possible rejection region. Imagine trying to fill a fishbowl with as many
marbles as possible when the marbles are different sizes. Since we do not want to take up
space with larger marbles, we fill the fishbowl one marble at a time starting with the
smallest, then the second smallest, and so on until the bowl is full. In a similar fashion,
this is how the rejection region for Test 1 is formed, thus resulting in a most powerful
test.

Further  research  into  this  problem  will  look  at  other  possible  tests  and  easier  imple¬ 
mentation  of  Test  1. 


Figure 4. Test 1 vs. Tests 2 and 3 (K=6).

Figure 5. Test 1 vs. Tests 2 and 3 (K=7).

Figure 12. Power vs. $\Delta(P_0, P_1)$ (K=8).


NONPARAMETRIC SMALL SAMPLE TOLERANCE LIMITS


Donald M. Neal, Mark G. Vangel, and John Reardon

Materials Technology Laboratory
Watertown, MA 02172-0001
Watertown,  MA  02172-0001 

ABSTRACT 


Results from this clinical study have identified the Hanson-Koopmans
method as the most desirable nonparametric small sample lower
tolerance limit estimator in the range where conventional nonparametric
procedures are not defined. The Monte Carlo studies indicated that
this method worked well for sample sizes from 2 to 28. The authors'
initial effort using a linear function of the first four order
statistics was reasonably effective for sample sizes greater than 15.

Other efforts to obtain a solution to the problem include
extension of the quantile sign test, a scheme involving a reduction
factor for the first order value, and a smooth nonparametric quantile
estimator. These methods were not satisfactory due to either
instability of the first ordered value when sample sizes are small, or the
inability to provide a proper coverage rate for N < 28.

INTRODUCTION 


The inability to obtain exactly the same structural properties
from all specimens obtained from a manufactured material results in a
relatively large variability in strength measurements when a large
number of specimens are considered. In the case of designing an
aircraft structure, it is required to design such that a maximum stress
value exists in critical locations, and these values do not exceed the
minimum guaranteed material properties (strength). Obtaining minimum
strength values will reduce the possibility of some production
components containing weaker material than that from the laboratory
test element. This guaranteed minimum strength value is defined as the
design allowable (basis value) by aircraft design engineers.

Usually, the measured value is considered acceptable in estimating
the population parameters for predicting population percentiles. In
the case of the design engineer, it is advisable to have a prediction
which will determine the accuracy of the percentile estimate at a high
degree of statistical confidence. This is the correct interpretation
of a basis value. For example, certain military standards, e.g.,
MIL-HDBK-5, require material property data to be presented on an A or B
allowable basis. The allowables represent a value determined from a
specified probability of survival with a 95 percent confidence in the
assertion. The survival probabilities are .99 for the A allowable, and
.90 for the B allowable.

MTL is involved in the development of the statistics chapter for
the MIL-17 Handbook on composite material in aircraft structural
design. The chapter will include methods for determining the design
allowable values. The inability to identify the statistical model from
limited or multi-modal data motivated the authors to find a
nonparametric model which will provide a correct tolerance bound (P = .95)
on the quantile values (P = .10). The conventional nonparametric method
using the quantile sign test provides a solution if there are at least
28 values in the sample. Unfortunately, the model needed is one for
sample sizes less than 28.

This paper presents the results of a clinical paper presented at
the ARO sponsored Thirty-Sixth Annual Design of Experiments Conference
on methods for obtaining an accurate measure of the above mentioned
design allowables involving small sample nonparametric modeling. It
should be noted that there are difficulties in extreme quantile
modeling techniques involving determination of tolerance bounds for the
quantile values in the allowable computation. Breiman, Stone, and Gins
have discussed the difficulties existing in model identification when
very small tail probabilities are required. This is the result of
parameter estimates that usually are obtained from data in the central
portion of the distribution, where most failures occur, leaving the
tail region limited in representation. This is unfortunate, since the
relatively small amount of data in the tail region is of prime
importance to the allowable computation. The nonparametric scheme can
model the lower ordered values of the distribution. The Hanson-Koopmans
model is recommended as a solution to the nonparametric small
sample tolerance limit when considering the various alternative
solutions. The application of the method does not result in overly
conservative estimates of the allowable values. Other methods were
attempted, including an extension of the quantile sign test, a linear
function of the first four order statistics (the authors' proposed method), a
smooth nonparametric quantile estimator, and an adaptive scheme
involving simulation procedures for obtaining the ratio of the first order
value to the allowable value. None of the above methods were
acceptable for the sample size requirements of $2 \le n \le 28$, due to
either computational problems or inability to provide minimum coverage
of 95%.


QUANTILE ESTIMATE - SAMPLE SIZE


The importance of determining a tolerance limit on the quantile
values is graphically displayed in Figures 1a and 1b. The standard
normal distribution function is plotted for sample sizes of 50 and 10,
using 25 sets of data. In Figure 1a, N = 50, the amount of spread in
quantile for the 10th percentile values is .80. Figure 1b shows a spread
of 2.4 for the same percentile. This example shows the importance of
having large sample sizes, or otherwise providing a tolerance limit on
the quantile estimate.

Often in structural design, a criterion requires material property
values to be larger than the design stress in order to define the
margin of safety. Determining a property value from 10 material
strength tests in order to obtain 90% reliability estimates could
result in nonconservative values and possible structural failure.
Obtaining a lower 95% confidence bound on the reliability estimate can
provide the necessary assurance.

DEFINITION OF THE B-BASIS VALUE


The B-basis value is a random variable where an observed basis
value (design allowable) from a sample will be less than the 10th
percentile of the population with a probability of .95. In Figures 2a
and 2b, a graphical display is shown for the basis value probability
density function (N(0,1)) for sample sizes of n = 10 and 50. The
dotted vertical lines represent the location for the 10th percentile
($X_{.10}$) of the population and the probability (basis value < $X_{.10}$) = .95
for the basis value probability density function. The graphical
display of the basis value density functions shows much less dispersion
for n = 50 than for n = 10. Small sample sizes will result in more
conservative estimates of the basis values.

QUANTILE SIGN TEST (Conventional Nonparametric Analysis)


The quantile sign test is introduced in the text as a procedure
that provides an accurate B-basis value for n > 28. The authors
initially attempted to extend this method for $n \le 28$ using various
procedures related to the first ordered value without success.

The analysis involves considering, for example, that if $q_{.10}$ is the .10 quantile
of a distribution, then the number of sample values less than $q_{.10}$ is a binomial random
variable with n trials and probability of .10. If $X_{(r)}$ is the rth ordered value
in the sample, the B-basis value is equal to $X_{(r)}$, where $r \ge 1$ is the
largest integer solution to

$$\sum_{w=r}^{n} \binom{n}{w} (.10)^w (.90)^{n-w} \ge .95,$$

where

$$\binom{n}{w} = \frac{n!}{w!\,(n-w)!}$$

and n = sample size.


See Table I for computing values for r given n.
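The rank r can be computed directly from the binomial tail condition. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes the condition is the upper-tail sum from w = r to n, which is the reading that reproduces the first entries of Table I.

```python
from math import comb


def upper_tail(n, r, p=0.10):
    """P(W >= r) for W ~ Binomial(n, p)."""
    return sum(comb(n, w) * p**w * (1 - p) ** (n - w) for w in range(r, n + 1))


def b_basis_rank(n, p=0.10, confidence=0.95):
    """Largest r >= 1 with P(W >= r) >= confidence, or None if there is none."""
    best = None
    for r in range(1, n + 1):
        if upper_tail(n, r, p) >= confidence:
            best = r
        else:
            break  # the tail probability only decreases as r grows
    return best


for n in (28, 29, 46, 61):
    print(n, b_basis_rank(n))
```

A result of None corresponds to the sample sizes below 29 for which the conventional procedure is not defined.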


[Figure: normal probability density function, mean = 0.0, standard deviation = 1.0]

TABLE I


Ranks (r) of Observations (n) for Determining
B-Basis Values for an Unknown Distribution


   n     r  |    n     r  |    n     r
 <29    --  |  129     8  |  227    16
  29     1  |  142     9  |  239    17
  46     2  |  154    10  |  251    18
  61     3  |  167    11  |  263    19
  76     4  |  179    12  |  275    20
  89     5  |  191    13  |  298    22
 103     6  |  203    14  |  321    24
 116     7  |  215    15  |  345    26

EXTENDED QUANTILE SIGN TEST


The extended quantile sign test was developed in order to obtain
the B-basis values for n < 29 using nonparametric procedures.

Let n be a fixed value less than 29. Calculate the probability
values $P_1, \ldots, P_k$ as follows: for $1 \le j \le k$, $P_j$ is the solution of

$$.05 = (1 - P_j)^n + \binom{n}{1}(1 - P_j)^{n-1} P_j + \binom{n}{2}(1 - P_j)^{n-2} P_j^2 + \cdots + \binom{n}{j-1}(1 - P_j)^{n-j+1} P_j^{j-1},$$

where $k \ll n$.


Example: If n = 15, let k = 3; then $P_1$ = .181, $P_2$ = .280, and
$P_3$ = .364, with corresponding order statistics $X_{(1)}$, $X_{(2)}$, and $X_{(3)}$
(see Figure 3).

The following interpolation models were applied in order to obtain $X_R$
for P = .10:

$$X = A \log(BP + 1), \qquad (3)$$

$$X = A \log(PB + C) \text{ given } P_0 = .0001 \text{ and } X_0 = 0.0, \qquad (4)$$

and

$$X = AP^2 + BP + C \text{ given } P_0 = .0001,\ X_0 = 0.0. \qquad (5)$$

The model in equation (3) failed to provide acceptable
interpolation results because of its inability to represent $(P_0, X_0)$,
$(P_1, X_1)$, $(P_2, X_2)$, and $(P_3, X_3)$ effectively.

The models from equations (4) and (5) provided adequate
interpolation results for P = .10. The computation procedures for
obtaining each B-basis value from a sample require either linear
(Equation 5) or nonlinear (Equation 4) regression models. These are
not simple computational methods when compared to the conventional quantile
sign test application. The authors applied a simulation process with a
given n value and $N(0, \sigma)$ models to approximate the probability density
function for

$$F = \frac{\text{B-basis}}{X_{(1)}}. \qquad (6)$$

The purpose was to obtain a reduction factor F for the first ordered
value in order to obtain the basis value for a given sample size n. This
is a similar computational procedure to that in the conventional
nonparametric method.

A schematic of the results is shown below:


FIGURE 4


The substantial spread in the N(0,5) case for F was primarily the
result of unstable first ordered values for the simulation. The
authors have also rejected the above approach since it requires
individual regression modeling for each sample if an accurate basis
number is to be obtained.


B-BASIS VALUES FROM FIRST FOUR ORDERED STATISTICS


This method involves determining a linear relationship between the
weighted gaps of the first four ordered values ($x_{(1)}$, $x_{(2)}$, $x_{(3)}$, and
$x_{(4)}$) and the standard deviation s of a normal population N(20,s).
Random selection of R = 5000 samples of size n were obtained for
selected integer values $1 \le S \le 5$. A 95% tolerance limit is obtained
for the 10th percentile of the N(20,S) distributions. This value
$X_{.10,.95}$ represents the value where 95% of all values are < $X_{.10}$ (10th
percentile of distribution) from the random sample of size n. This
$X_{.10,.95}$ value approximates the B-basis value for N(20,S).


Weibull and normal distribution


HANSON-KOOPMANS TOLERANCE LIMITS


The Hanson-Koopmans method provided the most satisfactory results
for obtaining small sample nonparametric B-basis values. The method
involves applying the following equation:

$$\text{B value} = x_{(k+j+1)} - C_n \left( x_{(k+j+1)} - x_{(k+1)} \right), \qquad (10)$$

where $x_{(1)}, x_{(2)}, x_{(3)}, \ldots, x_{(n)}$ are ordered statistic values,
and $C_n$ is obtained from solution of

$$.95 = \frac{n!}{k!\,(j-1)!\,(n-k-j-1)!} \iint_R w^k (v - w)^{j-1} (1 - v)^{n-k-j-1} \, dw \, dv, \qquad (11)$$

where $j \ge 1$ and $k \ge 0$. The region of integration R is bounded by the lines w = 0 for
$0 \le v \le 1$, v = 1 for $0 \le w \le 1$, the line w = v for $0 \le v \le p$, and by the
curve

$$w = p^{1/C_n} v^{(C_n - 1)/C_n}$$

for $p \le v \le 1$.

The $C_n$ values for equation 10 are tabulated in Table II, where
j = 3 and k = 0, to obtain basis values for $3 < n \le 30$ in the
following manner:

$$\text{B value} = x_{(4)} - C_n \left( x_{(4)} - x_{(1)} \right).$$
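Applying the B value formula with j = 3 and k = 0 requires only the first and fourth ordered values and the tabulated factor. The Python sketch below is illustrative: the function name and the strength data are hypothetical, and the three factors shown are copied from Table II as printed.

```python
def hanson_koopmans_b_value(sample, c_n):
    """Return x(4) - C_n * (x(4) - x(1)) for a sample of at least four values."""
    x = sorted(sample)
    if len(x) < 4:
        raise ValueError("this form of the estimator needs at least four observations")
    x1, x4 = x[0], x[3]
    return x4 - c_n * (x4 - x1)


# A few C_n factors from Table II (j = 3, k = 0), keyed by sample size n.
C_N = {15: 2.068, 20: 1.564, 25: 1.198}

# Hypothetical strength measurements, n = 15.
sample = [52.1, 40.0, 47.3, 45.0, 43.2, 44.1, 49.8, 55.0,
          50.2, 48.7, 46.9, 51.5, 53.3, 47.9, 49.0]
b_value = hanson_koopmans_b_value(sample, C_N[len(sample)])
print(b_value)
```

Whenever $C_n$ exceeds one, the B value falls below the smallest observation, which is how the method extrapolates into the lower tail.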


Using the first and fourth ordered values provided acceptable results,
although another combination could have provided a better approximation
to the desired coverage rate of 95%. In the case where $3 \ge n > 1$, the
following equations were obtained:


B  val ue  •  *  * j ) . 

where $C_2$ = 35.2 and $C_3$ = 28.8,
for n = 2 and 3 respectively.


A Comparison of Hanson-Koopmans (HK) and Linear Order Statistic Model (LOSM)


TABLE II


  n    C_n   |   n    C_n   |   n    C_n   |   n    C_n
  4   4.505  |  11   2.635  |  18   1.745  |  25   1.198
  5   4.101  |  12   2.474  |  19   1.651  |  26   1.136
  6   3.765  |  13   2.327  |  20   1.564  |  27   1.078
  7   3.478  |  14   2.192  |  21   1.482  |  28   1.027
  8   3.229  |  15   2.068  |  22   1.404  |  29    .971
  9   3.009  |  16   1.952  |  23   1.331  |  30    .811
 10   2.612  |  17   1.845  |  24   1.263  |


A Monte Carlo study was completed involving a comparison of
coverage rates obtained from HK and LOSM, where the Weibull $W(\alpha, \beta)$ and
Normal $N(\mu, \sigma)$ were the selected probability density functions. In the
simulation, the confidence coefficient (coverage rate) was obtained
by determining the percentage of replications for which the B value
was less than the actual 10th percentile of the distributions. 5,000
replications were used in the experiment. A minimum percent of 95 is
required. A percentage slightly greater than 95 is also desirable.
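The coverage-rate calculation can be sketched in Python as follows for the normal case. This is not the authors' program: the function names, the random seed, and the choice of $\sigma$ = 6 are assumptions, while n = 15, $\mu$ = 50, the 5,000 replications, and the factor 2.068 are taken from the text and Table II.

```python
import random
from statistics import NormalDist


def hk_b_value(sample, c_n):
    """Hanson-Koopmans B value from the first and fourth ordered values."""
    x = sorted(sample)
    return x[3] - c_n * (x[3] - x[0])


def coverage_rate(n, c_n, mu, sigma, replications=5000, seed=0):
    """Fraction of replications in which the B value is below the true 10th percentile."""
    rng = random.Random(seed)
    true_p10 = NormalDist(mu, sigma).inv_cdf(0.10)
    hits = 0
    for _ in range(replications):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if hk_b_value(sample, c_n) < true_p10:
            hits += 1
    return hits / replications


rate = coverage_rate(n=15, c_n=2.068, mu=50.0, sigma=6.0)
print(rate)
```

Because the estimator is a linear function of the order statistics, the coverage rate for a normal population does not depend on the particular $\mu$ and $\sigma$ chosen, apart from simulation noise.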

In Tables III and IV, the coverage rate is tabulated for
both the Linear Order Statistic Method and the Hanson-Koopmans Method
where the normal and Weibull models are used in the simulation process.
In Table III, a range of standard deviations is considered in order to
examine the effects of dispersion in the data. LOSM results show
poor coverage rates when n = 10, and acceptable coverage rates for n = 15
and $\sigma \le 10$. The Hanson-Koopmans results show universal acceptance
except for marginal acceptability for n = 15. The authors also
obtained results for n = 14, 15, 16, 17, 18, and 25 for the HK method.
In all cases, coverage rates of at least .95 were obtained, indicating
that the lowest values are for n = 15. A different set of ordered
values could possibly increase coverage values for n > 15. Results
from the table indicate an optimization process could be developed
where a set of ordered values would be determined to provide the
minimum acceptable coverage (.95) depending on sample size n. This
would prevent the overconservatism shown in the tables (e.g. .99,
.98, .97 coverage rate).

In Table IV, a range of $\beta$ values (shape parameter) for Weibull
functions is used for examining effects of dispersion. Again, the
LOSM results show poor coverage rates when n = 10. The case when
n = 15 shows reasonable values since .93 is the lowest value.

The Hanson-Koopmans results are similar to those shown in Table
III. The minimum values are at n = 15, which also occurred when the
normal model was used in the simulation process.

It can be inferred from the above results that the HK method is a
desirable nonparametric procedure for obtaining B-basis values when
$n \le 28$. It is not clear why the reduction in coverage to .94 exists
for n = 15, while n = 2 and n = 30 have a coverage rate of .99.
Ideally, coverages of .95 for all n and dispersion parameters would be
desirable to prevent overly conservative estimates of basis values.


TABLE III


Confidence Coefficient (X), $N(\mu, \sigma)$, $\mu$ = 50


         LINEAR ORDER
         STATISTIC METHOD     HANSON-KOOPMANS METHOD

sigma    n=10    n=15         n=5     n=10    n=15    n=30
   2     .60     .99          .99     .97     .95     .99
   6     .76     .98          .99     .97     .94     .99
  --     .78     .94          .99     .98     .94     .99
  14     .78     .92          .99     .98     .94     .99
  30     .80     .90          .99     .97     .94     .99

TABLE IV

Confidence Coefficient (X), $W(\alpha, \beta)$, $\alpha$ = 50


LINEAR ORDER
STATISTIC METHOD


HANSON-KOOPMANS METHOD

n=5     n=10    n=15    n=30
.99     .98     .96     .99
.99     .97     .95     .99
.98     .97     .94     .99
.98     .97     .94     .99
.98     .97     .94     .99

CONCLUSIONS 


The Hanson-Koopmans nonparametric small sample tolerance limit
model provided the most desirable solution to obtaining B-basis values.
The authors' method, LOSM, provided an acceptable method if n ≥ 15. For
small sample sizes, results were excessively non-conservative.

Methods involving factors of the first order statistic resulted in
overly conservative or non-conservative B-value estimates, depending on
the dispersion of data and the sample size. The extended quantile sign
test failed to provide either a computationally simple solution to
obtaining basis values, or a factor associated with the first ordered
value in calculating the B-basis value. The need for repeated
application of non-linear regression to each sample, when factors were
not available, reduces its value as an engineer's statistical method.
The conventional quantile sign test was not applicable for n < 29,
although it is an acceptable procedure otherwise.
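
The n < 29 limit quoted above can be reproduced with a short calculation. The sketch below assumes the usual definition of a B-basis value (a 95% lower confidence bound on the 10th percentile), which this section does not restate, and it assumes the conventional nonparametric bound is the sample minimum. The function names are illustrative only.

```python
def confidence_of_minimum(n, p=0.10):
    """P(sample minimum < p-th quantile) for n i.i.d. continuous observations."""
    return 1.0 - (1.0 - p) ** n


def smallest_n(confidence=0.95, p=0.10):
    """Smallest n for which the sample minimum reaches the stated confidence."""
    n = 1
    while confidence_of_minimum(n, p) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    print(smallest_n())
    print(confidence_of_minimum(28), confidence_of_minimum(29))
```

Under these assumptions the confidence first reaches .95 at n = 29, which is why a small-sample method such as Hanson-Koopmans is needed for n ≤ 28.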
A SECOND LOOK AT THE
PERVERSITY OF MISSING POINTS IN THE 2^4 DESIGN

Carl T. Russell

US Army Operational Test and Evaluation Agency
Falls Church, Virginia


ABSTRACT. At the 1982 Design of Experiments Conference, the
author presented a Clinical Paper entitled The Perversity of Missing
Points in the 2^4 Design. That paper tried to characterize what
points could be deleted from the 2^4 design without losing the
resolution V property (that is, main effects and 2-factor interactions
are estimable). That paper used brute force (computer plus sweat)
methods to investigate numerous special cases and formulate some
promising conjectures, but no general conclusions were reached.
G.E.P. Box was the primary discussant on the paper, and he
suggested using a matrix trick to reduce the dimensionality of the
problem from eleven to five. The current paper shows how notation
from group theory and graph theory can be used to exploit Box's
suggestion to prove the conjectures of the original paper. In
particular, even if five of the sixteen points are deleted at random
from a 2^4 design, the probability is almost 0.7 that the resulting
design is still resolution V; that is, all eleven parameters are
estimable from the remaining eleven data points. Unfortunately, the
method used does not appear to generalize to larger designs of
greater interest.

I. INTRODUCTION. Execution of a military field test seldom
proceeds exactly as planned, and rather large amounts of missing
data are common. In fact, two other papers given at this Design of
Experiments Conference dealt with aspects of the problem. Winner
and Smith described a situation where a large portion of the planned
experiment captured no data. Bryson and Russell presented a
method for adjusting attrition estimates from "Real Time Casualty
Assessment" based on changed estimates of kill probabilities which
were "missing" when the real time casualty assessments were
made. In 1982, I approached the problem from a different angle by
studying what happens when points are arbitrarily deleted from a
factorial design (Russell, 1983a). This study was motivated by the
observation that most field tests of military materiel are designed in
a factorial framework and conducted in blocks of time and/or space.
The blocks could in theory be constructed from appropriately chosen
fractional factorials to reduce the potential bias due to confounding
which is common in much traditional field test design (see
Russell, 1981, 1982, 1983b, and Section V of this paper). Before such
a design approach can be prudently implemented in expensive field
tests, however, an understanding of its robustness to substantial
data loss is needed.

In the summer of 1982, I began this study by looking at the
simplest interesting factorial design, the 2^4 design. The study asked
two questions:

(1)  Characterization problem: what points can be deleted
from the 2^4 design without losing estimability of the
mean, main effects and 2-factor interactions (resolution V
property)?

(2)  Structural problem: when the remaining design is
resolution V, what is the structure of the least squares
estimates obtained?

The problem turned out to be much harder than I anticipated, and
it grew into a Clinical Paper presented at the 1982 Design of
Experiments Conference (Russell, 1983a). That paper used brute force
methods to beat a portion of the structural problem to death using a
computer and to make some promising conjectures for the
characterization problem. G.E.P. Box was the primary discussant,
and he made a suggestion for the characterization problem which
enabled me to prove the original conjectures and quantify the
likelihood that random deletions of points from the 2^4 design would
destroy the resolution V property. This paper presents the results
growing out of Professor Box's suggestion. Unfortunately, the
methods used do not appear to generalize to larger designs of greater
interest.

Why write this paper if the results essentially represent a dead
end? First, it closes the loop from a Clinical Session where, as an
Army statistician, I received useful assistance on an important
problem which enabled me to proceed further than I otherwise could
have. Second, even though the methods of this paper do not appear
to generalize, they are mathematically appealing, they took me a
good part of the 1982-83 winter to derive, and they enable
quantitative results which make me more optimistic that some
fractional factorial blocking approaches may be quite robust against
random data loss. Third, this paper re-emphasizes an important
problem which needs and deserves further work by statisticians.


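II. PRELIMINARIES. A standard notation for the four factors
and sixteen points in the 2^4 design is the following.

The full model can be written (in slightly unusual order) as

$$
\begin{aligned}
Y_{ijkl} = {} & \mu + \alpha_i + \beta_j + \gamma_k + \delta_l \\
& + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\alpha\delta)_{il}
  + (\beta\gamma)_{jk} + (\beta\delta)_{jl} + (\gamma\delta)_{kl} \\
& + (\beta\gamma\delta)_{jkl} + (\alpha\gamma\delta)_{ikl}
  + (\alpha\beta\delta)_{ijl} + (\alpha\beta\gamma)_{ijk} \\
& + (\alpha\beta\gamma\delta)_{ijkl} + \varepsilon_{ijkl},
\end{aligned}
\qquad (2)
$$

where the subscripts can be removed by the usual side conditions:

$$
\begin{aligned}
& \alpha_0 + \alpha_1 = 0, \text{ that is } \alpha_i = \pm\alpha, \\
& \qquad \vdots \\
& \delta_0 + \delta_1 = 0, \text{ that is } \delta_l = \pm\delta, \\
& (\alpha\beta)_{00} + (\alpha\beta)_{01} = (\alpha\beta)_{10} + (\alpha\beta)_{11} = 0,
  \text{ that is } (\alpha\beta)_{ij} = \pm(\alpha\beta), \\
& (\gamma\delta)_{00} + (\gamma\delta)_{01} = (\gamma\delta)_{10} + (\gamma\delta)_{11} = 0,
  \text{ that is } (\gamma\delta)_{kl} = \pm(\gamma\delta), \\
& (\alpha\beta\gamma\delta)_{0000} + (\alpha\beta\gamma\delta)_{0001} = \cdots = 0,
  \text{ that is } (\alpha\beta\gamma\delta)_{ijkl} = \pm(\alpha\beta\gamma\delta).
\end{aligned}
\qquad (3)
$$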


In matrix form, the normal equations are

$$ X'X\,\hat{\boldsymbol\beta} = X'Y, \qquad (4) $$

where the design matrix, X, has rows corresponding to design points
and columns corresponding to parameters. The reduced main effects
and 2-factor interactions model is:

$$
\begin{aligned}
Y_{ijkl} = {} & \mu + \alpha_i + \beta_j + \gamma_k + \delta_l \\
& + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\alpha\delta)_{il}
  + (\beta\gamma)_{jk} + (\beta\delta)_{jl} + (\gamma\delta)_{kl} + \varepsilon_{ijkl},
\end{aligned}
\qquad (5)
$$

which reduces the number of columns of X from 16 to 11. Figure 1
shows the design matrix for the full model, (2), partitioned to
accentuate the missing parameters in model (5), and assuming the
unsubscripted parameters from (3) are used. Deleting points from
the design corresponds to deleting rows from X. The normal
equations, (4), have a least squares solution iff X'X is nonsingular,
in which case the solution is

$$ \hat{\boldsymbol\beta} = (X'X)^{-1} X'Y. \qquad (6) $$


Since there are 11 parameters in the reduced model, (5), solving the
characterization problem via (6) for 5 or fewer missing points requires
checking a matrix of dimensions at least 11 x 11 for singularity.
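
The direct check described above can be sketched numerically. The code below is a minimal illustration, not the author's program: it assumes the factor levels are coded -1/+1, builds the 11-column matrix of model (5), deletes the rows for a chosen set of missing points, and tests whether the remaining matrix still has full column rank. The names `POINTS`, `model_matrix`, and `is_resolution_v` are hypothetical.

```python
import itertools
import numpy as np

# Design points as (a, b, c, d) levels coded -1/+1 (the coding is an assumption).
POINTS = list(itertools.product([-1, 1], repeat=4))


def model_matrix(max_order=2):
    """Columns: mean, main effects, and interactions of up to max_order factors."""
    effects = [e for r in range(max_order + 1)
               for e in itertools.combinations(range(4), r)]
    return np.array([[np.prod([pt[i] for i in e]) for e in effects]
                     for pt in POINTS], dtype=float)


def is_resolution_v(missing):
    """True if all 11 parameters of model (5) remain estimable without `missing`."""
    gone = set(missing)
    keep = [i for i, pt in enumerate(POINTS) if pt not in gone]
    X1 = model_matrix(2)[keep]
    return np.linalg.matrix_rank(X1) == X1.shape[1]


if __name__ == "__main__":
    # Quarter replicate with a = b and c = d, i.e. I = AB = CD = ABCD.
    quarter = [pt for pt in POINTS if pt[0] == pt[1] and pt[2] == pt[3]]
    print(is_resolution_v([]), is_resolution_v(quarter))
```

Deleting that quarter replicate destroys estimability, while any three deleted points leave the design resolution V, in line with the results stated later in the paper.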

To reduce the dimensionality of the characterization problem,
Box suggested partitioning the design matrix, X, by making the m
missing points correspond to the last m rows and the p parameters
of interest correspond to the first p columns:

$$ X = \begin{pmatrix} X_1 & X_2 \\ X_3 & X_4 \end{pmatrix}, $$

so that X_1 is (n - m) x p and X_4 is m x (n - p). By assuming
orthogonality of X and expanding the orthogonality
relationships X'X = nI_n = XX' in matrix form, Box proved the
following lemma via an eigenvalue argument.

Lemma 1. X_1'X_1 is nonsingular iff X_4 X_4' is nonsingular.

Figure 1. The Design Matrix, X, for the 2^4 Design,
Partitioned to Show Present and Missing Parameters in the
Main Effects and 2-Factor Interactions Model.

In the case of the 2^4 design, the lemma reduces the dimensionality
of the characterization problem from eleven or more to five or less.
For example, with five missing points (m = 5) and p = 11 (the number
of parameters in (5)), the 5-dimensional square matrix X_4 X_4' could
be examined for singularity instead of the 11-dimensional square
matrix X_1'X_1, and the problem gets easier with fewer than five
missing points.
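
The lemma can be checked numerically for the 2^4 design. The sketch below is an illustration under stated assumptions rather than Box's argument: levels are coded -1/+1, the 16 columns are ordered by interaction order so that the first 11 are the parameters of model (5), and the block names `X1` and `X4` are my own labels for the present-points-by-retained-parameters block and the missing-points-by-excluded-parameters block.

```python
import itertools
import numpy as np

POINTS = list(itertools.product([-1, 1], repeat=4))
# Effects ordered by size: mean, 4 main effects, 6 two-factor interactions,
# then the 4 three-factor interactions and the four-factor interaction.
EFFECTS = [e for r in range(5) for e in itertools.combinations(range(4), r)]
X = np.array([[np.prod([pt[i] for i in e]) for e in EFFECTS] for pt in POINTS],
             dtype=float)


def lemma_holds(missing_rows, p=11):
    """Compare singularity of X1'X1 (p x p) with that of X4 X4' (m x m)."""
    missing_rows = list(missing_rows)
    present = [r for r in range(16) if r not in missing_rows]
    X1 = X[np.ix_(present, list(range(p)))]
    X4 = X[np.ix_(missing_rows, list(range(p, 16)))]
    big_ok = np.linalg.matrix_rank(X1.T @ X1) == p
    small_ok = np.linalg.matrix_rank(X4 @ X4.T) == len(missing_rows)
    return big_ok == small_ok


if __name__ == "__main__":
    print(all(lemma_holds(m) for m in itertools.combinations(range(16), 5)))
```

For five missing points the small matrix is only 5 x 5, which is the reduction in dimensionality the text describes.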

The rows of the full 2^4 design corresponding to 3- and 4-factor
interactions (that is, the rows of the missing parameter submatrix
in Figure 1) represent the vertices of the 5-dimensional hypercube
which have an even number of minus signs. By Lemma 1, the
characterization problem is reduced to characterizing the subsets of
these even vertices which are linearly dependent. All the vertices of
the 5-dimensional hypercube form a group C_5 under coordinatewise
multiplication, and the even vertices form a subgroup E_5. The
subgroup E_5 is isomorphic to the quotient group obtained by
identifying opposite vertices (i.e., those with all signs switched) in
C_5. By mnemonically relabeling the columns of missing parameters in
Figure 1,

(βγδ)  → A, since α is missing from (βγδ),

(αγδ)  → B, since β is missing from (αγδ),

(αβδ)  → C, since γ is missing from (αβδ),

(αβγ)  → D, since δ is missing from (αβγ),

(αβγδ) → E, since ε is missing from (αβγδ),

and letting a letter A, B, C, D or E appear in a vertex label iff the
sign of the respective coordinate is positive, the quotient group
becomes E_5 ≅ C_5/{I, ABCDE}. Figure 2 gives the new labeling of the
missing parameter submatrix from Figure 1 in terms of A, B, C, D,
and E together with the respective cosets. The ±1's of Figure 1 have
been replaced by simply +'s and -'s in Figure 2, and one- or
two-letter design point labels are underlined to indicate that they
will be used as standard coset labels. (Group theory is used here only
for limited notational convenience: not much algebra is exploited.
Likewise, the graph theory introduced in the next section is used
simply as a bookkeeping tool. Better exploitation of these
mathematical objects might lead to more general results.)



OLD POINT     ------------ PARAMETER LABELS(2) ------------     NEW POINT LABELS(3)
LABELS(1)     (βγδ)    (αγδ)    (αβδ)    (αβγ)    (αβγδ)        (COSETS(4))
                A        B        C        D        E

(1)             -        -        -        -        +           E = ABCD
a               -        +        +        +        -           BCD = AE
b               +        -        +        +        -           ACD = BE
ab              +        +        -        -        +           ABE = CD
c               +        +        -        +        -           ABD = CE
ac              +        -        +        -        +           ACE = BD
bc              -        +        +        -        +           BCE = AD
abc             -        -        -        +        -           D = ABCE
d               +        +        +        -        -           ABC = DE
ad              +        -        -        +        +           ADE = BC
bd              -        +        -        +        +           BDE = AC
abd             -        -        +        -        -           C = ABDE
cd              -        -        +        +        +           CDE = AB
acd             -        +        -        -        -           B = ACDE
bcd             +        -        -        -        -           A = BCDE
abcd            +        +        +        +        +           ABCDE = I

Tabulated Symbols Are the Signs of Entries in the
Missing Parameters Submatrix of Figure 1.

(1)  The usual way of labeling design points, or treatments, in
2^n experimental designs; see (1) in text.

(2)  Greek letters in parentheses are the old parameter labels;
outlined capital letters are the new parameter labels.

(3)  An outlined letter appears in the new treatment label (first
column) iff + appears in the respective column.

(4)  Cosets obtained by identifying opposite vertices of the five
dimensional hypercube. Underlined labels (one- or two-
letter combinations) are used as standard coset labels.

Figure 2. The Matrix of Missing Parameters,
Showing New Labels.
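
The entries of Figure 2 follow mechanically from the sign rule for interaction contrasts, so they can be regenerated as a check. The sketch below assumes the usual convention that a contrast is the product of the -1/+1 factor levels; the helper names `signs` and `labels` are illustrative and do not come from the paper.

```python
FACTORS = "abcd"
LETTERS = "ABCDE"
# Excluded effects in the paper's column order: A=(bcd), B=(acd), C=(abd),
# D=(abc), E=(abcd), written with Latin factor letters.
EFFECTS = ["bcd", "acd", "abd", "abc", "abcd"]

# Design points in standard order: (1), a, b, ab, c, ...; (1) is the empty string.
POINTS = ["".join(f for i, f in enumerate(FACTORS) if (mask >> i) & 1)
          for mask in range(16)]


def signs(point):
    """Signs of the five excluded contrasts at a point such as 'ab'."""
    row = []
    for eff in EFFECTS:
        low = sum(1 for f in eff if f not in point)  # factors at the low level
        row.append(1 if low % 2 == 0 else -1)
    return row


def labels(point):
    """Return (new label, opposite vertex, standard coset label)."""
    plus = "".join(L for L, s in zip(LETTERS, signs(point)) if s > 0)
    comp = "".join(L for L in LETTERS if L not in plus)
    standard = min(plus, comp, key=len) or "I"
    return plus, comp, standard


if __name__ == "__main__":
    for pt in POINTS:
        row = "".join("+" if s > 0 else "-" for s in signs(pt))
        print(f"{pt or '(1)':5s} {row}  {labels(pt)}")
```

Each row has an even number of minus signs, and the sixteen standard coset labels (I, five single letters, ten double letters) are all distinct.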


III. MAIN RESULT. The geometric underpinnings of the
reformulated problem suggest a geometric approach to its solution.
The problem is to characterize linearly dependent subsets of the
vertices of a hypercube. Clearly, opposite vertices of a hypercube
are pairwise linearly dependent (hence interchangeable in linearly
dependent subsets), so the identification of opposite vertices in cosets
simply gets rid of a trivial nuisance. Once opposite vertices are
identified in the 5-dimensional hypercube, defining an edge between
two new vertices if there was an edge between pairs of old vertices
is natural. In fact, it is useful to define a quotient graph on the new
vertices as follows.

Definition. The quotient graph, G_5 of C_5, is the graph whose
vertices, V(G_5), are the elements of the quotient group
E_5 and whose edges, E(G_5), connect any pair of vertices
in V(G_5) whose cosets were connected in C_5.

In the notation of Figure 2, V(G_5) consists of the identity, single
letters, and double letters, and E(G_5) consists of edges connecting:

•  I with each of the single letters A, B, C, D, and E.

•  Each of A, B, C, D, and E with I and with each double
letter containing it.

Example: [diagram not reproduced; surviving labels AB, AE]

•  Each double letter with all single letters contained in it
and with each other double letter disjoint from it.

Example: [diagram not reproduced; surviving label CE]
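
The edge rules listed above can be derived directly from the hypercube, which gives a quick consistency check. The sketch below is an illustration only: vertices are represented as sets of letters with positive sign, a coset is labeled by the smaller of a set and its complement, and two cosets are joined when some pair of representatives differs in exactly one coordinate. All names are hypothetical.

```python
import itertools

LETTERS = frozenset("ABCDE")


def coset(s):
    """Standard label of the coset {s, complement of s}: the smaller set."""
    s = frozenset(s)
    return s if len(s) <= 2 else LETTERS - s


# The 16 cosets: identity (empty set), 5 single letters, 10 double letters.
VERTICES = {coset(s) for r in range(6)
            for s in itertools.combinations("ABCDE", r)}


def adjacent(u, v):
    """Cosets are joined iff some representatives differ in one coordinate."""
    return any(len(a ^ b) == 1
               for a in (u, LETTERS - u) for b in (v, LETTERS - v))


def name(v):
    return "".join(sorted(v)) or "I"


def neighbors(v):
    return sorted(name(u) for u in VERTICES if adjacent(v, u))


if __name__ == "__main__":
    for v in sorted(VERTICES, key=lambda s: (len(s), name(s))):
        print(name(v), "->", neighbors(v))
```

Every vertex has exactly five neighbors, and the neighbor lists agree with the three bullet rules, for example AB is joined to A, B, CD, CE and DE.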


The usual definition of a subgraph is:

A subgraph S of a graph G defined by the
vertices V(S) is the graph such that V(S) ⊆ V(G) and
E(S) consists of the edges of G connecting vertices of S.

With this definition, the following theorem will be proved.

Theorem (Main Result). Let S be a subgraph of G_5 defined by V(S)
where V(S) has 5 elements. Then the elements of E_5
corresponding to V(S) are linearly dependent iff S contains a
subgraph of one of the following three forms.

[Diagrams of the three forms, labeled (2e), (3e), and (4e).]


Proof of Sufficiency. It is easy to show that each of the three types of
subgraph gives linear dependence. All are closely related to P.W.M.
John's three-quarter replicates (John 1971, pages 161-163), which
formed the basis for much of my earlier paper (see especially Table
1, page 604, of Russell 1983a, denoted by "T1" below).

The 2-edge type, (2e), with 4 vertices, corresponds to deleting
the quarter replicates of cases 2 and 5 in T1 (defining contrasts
similar to I=D=BC=BCD and I=AB=CD=ABCD).

Example:

Case 2.  I=D=BC=BCD                Case 5.  I=AB=CD=ABCD

abcd = I ------ A = bcd            abcd = I ------ E = (1)

 ad = BC ------ DE = d              cd = AB ------ CD = ab

The 4-edge type, (4e), with 4 vertices, corresponds to deleting
the quarter replicates of cases 1 and 3 in T1 (defining contrasts
similar to I=C=D=CD and I=BC=BD=CD).

Example:

Case 1.  I=C=D=CD                  Case 3.  I=BC=BD=CD

abcd = I ------ A = bcd            abcd = I ------ A = bcd
       |        |                         |        |
 acd = B ------ AB = cd             (1) = E ------ AE = a

The 3-edge type, (3e), with 5 vertices, corresponds to deleting
an appropriate additional point from the quarter replicates of
cases 4 and 6 in T1 (defining contrasts similar to I=D=-ABC=-ABCD
and I=ABC=ABD=CD).

Example:

Case 4.  I=D=-ABC=-ABCD            Case 6.  I=ABC=ABD=CD

These correspondences between the graphs and three-quarter
replicates show that the main result actually establishes the
conjecture on page 522 of Russell, 1983a.

Proof of Necessity. To prove necessity, assume without loss of
generality that I is a vertex of the subgraph and that I has the
maximum number of incident edges. The proof considers
cases based on the number of edges at I and, as a byproduct
used in later extensions of the theorem, counts the number of
possible graphs of each type which result in linear dependence.

Case 0-Edge. If there are no edges incident at I, then the only
subgraph has independent vertices.

Case 1-Edge. If there is one edge incident at I, then there are two
types of subgraph, one of which has type (2e) dependent
vertices which can occur in 240 ways.

[Diagram: Type (2e), 240 ways]

Case 2-Edge. If there are two edges incident at I, then there are
six conceptual types of subgraph, two of which are not
realizable, and one of which has type (2e) dependent vertices
which can occur in 480 ways.

[Diagram: two types marked NOT POSSIBLE; Type (2e), 480 ways]

Case 3-Edge. If there are three edges incident at I, then there are
three conceptual types of subgraph. One has type (3e)
dependent vertices which can occur in 160 ways. Another
has type (4e) dependent vertices which can occur in 480 ways.

[Diagrams: Type (3e), 160 ways; Type (4e), 480 ways]

Case 4-Edge. If there are four edges incident at I, then the only
subgraph has independent vertices.


IV. CONSEQUENCES OF MAIN RESULT. It follows
immediately from the main result that cases (2e) and (4e) represent
the only ways four points can be deleted from the 2^4 design and fail
to leave a design of resolution V, and that if fewer than four points
are deleted then the remaining design is of resolution V.

Corollary. Let S be a subgraph of G_5 defined by V(S) where V(S)
has 4 elements. Then the elements of E_5 corresponding to V(S)
are linearly dependent iff S contains a subgraph of one of the
following two forms.

[Diagrams of the two forms, labeled (2e) and (4e).]

Corollary. If fewer than 4 points are deleted at random from the 2^4
design, then the remaining design is of resolution V.

Totalling numbers of linearly dependent subgraphs identified in the
proof of the main result (with a recount in the 4-point case) gives
the following rather surprising quantitative results. They establish
the conjecture that most designs obtained by deleting four or five
points at random from the 2^4 design are of resolution V (Russell,
1983a, page 622).

Theorem. If 5 points are deleted at random from the 2^4
design, then the probability that the resolution V property
is lost is

Theorem. If 4 points are deleted at random from the 2^4
design, then the probability that the resolution V property
is lost is
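
These loss probabilities can be checked independently by brute force. The sketch below is not the author's counting argument: it simply enumerates every subset of m of the 16 even vertices of the 5-dimensional hypercube (the rows of the missing parameter submatrix) and counts those that are linearly dependent, which by the lemma are exactly the deletions that lose the resolution V property. The names `EVEN` and `count_dependent` are hypothetical.

```python
import itertools
import numpy as np

# The 16 "even" vertices: +/-1 vectors of length 5 with an even number of -1's.
EVEN = [v for v in itertools.product([-1, 1], repeat=5)
        if v.count(-1) % 2 == 0]


def count_dependent(m):
    """Return (number of linearly dependent m-subsets, number of m-subsets)."""
    total = dependent = 0
    for subset in itertools.combinations(EVEN, m):
        total += 1
        if np.linalg.matrix_rank(np.array(subset)) < m:
            dependent += 1
    return dependent, total


if __name__ == "__main__":
    for m in (3, 4, 5):
        dep, tot = count_dependent(m)
        print(m, dep, tot, dep / tot)
```

By this enumeration no 3-point deletion is dependent, 100 of the 1,820 four-point deletions are dependent (about 0.055), and 1,360 of the 4,368 five-point deletions are dependent (about 0.311). The five-point count equals the 240 + 480 + 160 + 480 ways tallied in the proof, and 1 - 1360/4368 is about 0.69, consistent with the "almost 0.7" quoted in the abstract.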
V. GENERALIZATION OF MAIN RESULT. The methods of this
paper do not appear to generalize in a useful manner to 2^n designs
with n>4. They might extend to the case where just n- and
(n-1)-factor interactions are ignored in the 2^n design, but that
situation is of little interest. Reduction of dimensionality is already a
big problem in the resolution V case with the 2^5 design: there are as
many excluded parameters as present parameters (16 parameters),
and the 16-dimensional hypercube looks very complicated. The
present methods would require looking at 32 of the 65,536 vertices of
the 16-dimensional hypercube to study loss of the resolution V
property. The case which originally interested me in this problem
was even larger, namely, a resolution V quarter replicate of the 2^8
design. I felt and still feel that such a design should be relatively
insensitive to data loss, but the methods of this paper don't seem to
provide a good way to look at missing points in such designs. It is
possible that extending the geometric, graph theoretic approach
might be easier than it appears. Or a usable general characterization
of linearly dependent vertices in an n-dimensional hypercube may
be known. Alternatively, there might be a purely algebraic
approach to the problem which would yield general results.

In any case, the general problem still needs and deserves more
work. As an exercise to see how far one might push 2^(n-k) fractional
factorial designs in a field test framework, I designed in 1983 a
hypothetical operational test for a communications jammer using
formal experimental design methods (Russell, 1983b). The resulting
design examined 32 factors, each nominally at 2 levels, in such a
way that 62 effects (including carefully chosen interactions) would
be estimable. The design had 512 = 2^9 points and in fact was a 2^-23 =
1/8,388,608 fraction of a 2^32 design (4,294,967,296 points) run in 64
blocks of size 8. The design would have required 8 days to run and
could have been easily extended in 8-day increments to a "full
factorial" test 2^19 = 524,288 times as long and lasting over 11,000
years. If one were really to try to run such a design, however, the
risk associated with missing points shouldn't be too serious because
there are many more points than parameters, and the results of
this paper concerning the 2^4 design suggest that the risk could be
quite small. But even at a cost of only $1,000 per data point,
actually running such a design would cost more than a half million
dollars. Large field tests cost many times more. The statistician's
risk in proposing even substantially more modest designs (such as
that in Russell, 1982) would be much less if there were better
theoretical understanding of robustness to data loss.
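
The run counts, fraction, duration and cost quoted for the hypothetical jammer test can be checked with a few lines of arithmetic. The sketch below only restates figures from the paragraph above; the 365.25-day year and the function name are my own assumptions.

```python
def jammer_design_arithmetic():
    """Recompute the quantities quoted for the 32-factor, 512-point design."""
    factors = 32
    runs = 2 ** 9                        # 512 points
    full = 2 ** factors                  # points in the full 2^32 factorial
    return {
        "runs": runs,
        "blocks_times_size": 64 * 8,     # 64 blocks of size 8
        "full_points": full,
        "inverse_fraction": full // runs,            # the design is 1/this of 2^32
        "extension_factor": 2 ** 19,                 # factor quoted in the text
        "extended_years": 8 * 2 ** 19 / 365.25,      # 8 days per 512-point design
        "cost_dollars": runs * 1000,                 # at $1,000 per data point
    }


if __name__ == "__main__":
    for key, value in jammer_design_arithmetic().items():
        print(key, value)
```

The results agree with the text: 64 blocks of 8 give 512 points, the fraction is 1/8,388,608, an extension 524,288 times as long lasts a little over 11,000 years, and 512 points cost slightly more than half a million dollars.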
A METHOD FOR THE STATISTICAL ANALYSIS OF THE
STRESS-STRAIN PROPERTIES OF EARTH MATERIALS

G. Y. Baladi and B. Rohani
Geomechanics Division, Structures Laboratory
U.S. Army Engineer Waterways Experiment Station
Vicksburg, Mississippi


ABSTRACT. Stress-strain properties of earth materials under various test
boundary conditions, such as uniaxial strain, hydrostatic compression, and
triaxial shear, are required for conducting a two-dimensional (2D) analysis of
explosive-induced ground shock. Such properties are random and often contain
artificial instrumentation-induced noise. The randomness is primarily due to
spatial variation of the soil properties, biases associated with field sampling
disturbance, and errors in laboratory testing equipment and procedures,
and must be accounted for in ground shock analysis. This necessitates the use
of 2D probabilistic wave propagation computer codes as opposed to deterministic
procedures. To use the stress-strain properties for such probabilistic
calculations, one must first eliminate (or reduce) the spurious
instrumentation-induced noise in the "raw" data and then statistically quantify the
"smoothed" data. The outcome of the statistical quantification is the
representation of the stress-strain data in terms of the expected response, its
variance, and the associated correlation coefficients. The paper discusses a
methodology for smoothing the raw stress-strain data and the subsequent
statistical analysis. Application of the methodology is demonstrated for Nellis
Baseline sand.


I. INTRODUCTION. The ground shock calculation techniques currently used
to predict the states of stress and ground motions induced in earth masses by
explosive detonations are deterministic tools. That is, the input parameters
(media constitutive properties and surface airblast loadings) are specified as
single-valued deterministic quantities or functions. In actuality, however,
both the constitutive properties of earth materials and the characteristics of
the airblast pulses are dispersed random variables. The randomness of these
input variables indicates that resulting stresses and ground motions are also
random variables. Therefore, ground shock problems should be analyzed
probabilistically. The purpose of the probabilistic analysis is to obtain a
quantitative understanding of how the variabilities or uncertainties in the
input parameters for a particular problem affect the dispersion of the output
quantities or parameters. To use the stress-strain properties for such
probabilistic analysis, one must first eliminate (or reduce) the spurious
instrumentation-induced noise in the "raw" data and then statistically
quantify the "smoothed" data. The outcome of the statistical quantification
is the representation of the stress-strain data in terms of the expected
response, its variance, and the associated correlation coefficients.

The paper presents the development of a computerized methodology for
statistically analyzing a set of random stress-strain data. This includes
(1) a procedure for eliminating the spurious noise in the raw data due to
instrumentation without affecting the actual physical response of the material
and (2) a procedure for statistically analyzing the random behavior of the
"smoothed" data. The outcome of these procedures is a representation of the
stress-strain data in terms of the expected response, its variance, and the
correlation coefficients. Application of the methodology is demonstrated for
Nellis Baseline sand.


II. DATA SMOOTHING PROCEDURE. The laboratory stress-strain data often
contain artificial noise due to instrumentation which must be filtered out
before the data can be used. Therefore, a technique to smooth the measured
data without changing the actual physical response of the material is needed.
Such a procedure has been developed by Baladi and Barnes (Reference 1) and is
based on the concept of a marching mean square. If the measured value of the
ith data point is expressed as y_m(x_i), the corresponding smoothed
response y_s(x_i) can be expressed as


where n - 1 is the window over which the marching mean square is taken
(i.e., (n - 1)/2 is the number of data points to the left and to the right of
the ith data point). Note that n has to be an odd number equal to or greater
than 3.

Equation 1 was applied to smooth the raw data from uniaxial strain
(Figure 1) and triaxial compression (Figure 2) tests for Nellis Baseline
sand. As shown in these figures, the results of these tests are quite noisy.
The value of n used to smooth these data was 5. Several passes had to be
made in order to obtain a satisfactory set of smoothed stress-strain
relations. The final set is shown in Figures 3 and 4, and it is noted that the
overall character of the stress-strain relation is not altered as a result of
the smoothing process (for example, compare Figures 1 and 3).
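
A windowed smoother of this kind can be sketched as follows. Equation 1 itself is not legible in this copy, so the code substitutes an ordinary centered moving average over n points, with (n - 1)/2 neighbors on each side and a truncated window at the ends. That substitution, the end-point handling, and the function name are all assumptions, not the Baladi and Barnes formula.

```python
import numpy as np


def marching_mean_smooth(y, n=5, passes=1):
    """Centered moving average over n points, repeated `passes` times.

    Stand-in for the paper's Equation 1 (an assumption): n must be odd and
    at least 3, and the window is truncated near the ends of the record.
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd number equal to or greater than 3")
    half = (n - 1) // 2
    y = np.asarray(y, dtype=float)
    for _ in range(passes):
        out = np.empty_like(y)
        for i in range(len(y)):
            lo, hi = max(0, i - half), min(len(y), i + half + 1)
            out[i] = y[lo:hi].mean()
        y = out
    return y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    strain = np.linspace(0.0, 1.0, 50)
    stress = strain ** 2 + rng.normal(scale=0.02, size=strain.size)
    print(marching_mean_smooth(stress, n=5, passes=3)[:5])
```

With n = 5 and several passes, as in the text, the noise is progressively reduced while a smooth underlying trend is left largely intact; a straight-line record, for instance, is unchanged away from the ends.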


III. STATISTICAL ANALYSIS OF SMOOTHED STRESS-STRAIN DATA. In this section, a
generic procedure is outlined for statistical analysis of nonlinear
stress-strain data. Consider a set of curves relating the random variables y and
x (Figure 5). The objective of the statistical analysis is to determine the
mean curve with its one-standard-deviation bounds relating the random
variables y and x. This can be accomplished by applying standard statistical
procedures to the slope of the random curves in Figure 5. The following steps
should be taken to conduct the statistical analysis:

(1)  For a given set of n curves, divide the x-axis into a number of
equal increments Δx (Figure 5).

(2)  For the ith increment, determine the slope of the jth curve,
denoted by Ω_ij.

(3)  Determine the expected value and the standard deviation of the slope
for the ith increment for all the curves according to the following
expressions:


(4)  Next, compute the mean and the standard deviation of y. To
accomplish this, the covariance and the correlation coefficient matrices of the
slopes, cov(Ω_k, Ω_m) and ρ_km, respectively, can be first calculated from the
following relations:

$$
\operatorname{cov}(\Omega_k, \Omega_m)
  = E\left[(\Omega_k - \bar{\Omega}_k)(\Omega_m - \bar{\Omega}_m)\right]
  = \frac{1}{n} \sum_{j=1}^{n} (\Omega_{kj} - \bar{\Omega}_k)(\Omega_{mj} - \bar{\Omega}_m)
\qquad (5)
$$


Finally, the mean value and standard deviation of y at the ith
increment become


Equations 8 and 9 were applied to the smoothed stress-strain data for
Nellis Baseline sand presented in Figures 3 and 4. The resulting curves are
shown in Figures 6 and 7. Each figure contains the mean response with its
one-standard-deviation bounds.
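
The four steps can be illustrated with a short sketch. Most of the paper's numbered equations are not legible in this copy, so the code below is a reconstruction under stated assumptions rather than the authors' program: all curves are sampled on a common x grid, the covariance of the slopes is normalized by the number of curves, and the variance of y at an increment is taken as the sum of the covariances of all slope increments up to that point. The function name is hypothetical.

```python
import numpy as np


def mean_and_std_from_slopes(x, curves):
    """Mean curve and standard deviation of y built from slope statistics.

    x      : common grid of length m + 1 (an assumption of this sketch)
    curves : array of shape (n_curves, m + 1), one row per curve
    Returns the mean and standard deviation of y at x[1:], assuming all
    curves share the same starting value at x[0].
    """
    x = np.asarray(x, dtype=float)
    Y = np.asarray(curves, dtype=float)
    dx = np.diff(x)
    slopes = np.diff(Y, axis=1) / dx                 # slope per curve and increment
    mean_slope = slopes.mean(axis=0)                 # step (3): expected slope
    cov = np.atleast_2d(np.cov(slopes, rowvar=False, bias=True))  # step (4)
    inc_cov = cov * np.outer(dx, dx)                 # covariance of y-increments
    mean_y = Y[:, 0].mean() + np.cumsum(mean_slope * dx)
    var_y = np.array([inc_cov[:i + 1, :i + 1].sum() for i in range(len(dx))])
    return mean_y, np.sqrt(np.maximum(var_y, 0.0))


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 6)
    curves = np.array([k * x ** 2 for k in (1.0, 2.0, 3.0)])
    mean_y, std_y = mean_and_std_from_slopes(x, curves)
    print(mean_y)
    print(std_y)
```

Because y at each increment is the running sum of slope times Δx, carrying the full slope covariance matrix (rather than only the per-increment variances) is what makes the one-standard-deviation bounds on y come out correctly when slopes in neighboring increments are correlated.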
Figure 2. Triaxial compression test results (σ_r = 20 MPa) for Nellis
Baseline sand.

Figure 4. Smoothed triaxial compression test results (σ_r = 20 MPa) for
Nellis Baseline sand.

Figure 5. General curves relating the random variables y and x.

Figure 6. Uniaxial strain test results for Nellis Baseline sand; mean
response with its one-standard-deviation bounds.

Figure 7. Triaxial compression test results (σ_r = 20 MPa) for Nellis
Baseline sand; mean response with its one-standard-deviation
bounds.
ON THE FOLLOWING ARTICLE

A Method for the Statistical Analysis of the Stress-Strain
Properties of Earth Materials by
G.Y. Baladi
and B. Rohani,
U.S. Army Engineer Waterways Experiment Station

W.T. Federer: Before any comments in depth could possibly be made,
a copy of the paper and discussions with the
experimenter would be required. From listening to
the lecture and pondering on the topic, it would
appear that an explosion creates a spherical shock
wave effect with only the radius of the sphere being
a random variable. The above would follow for one
medium such as air or water. However, when a second
medium is encountered the radius of a sphere
changes. That is, the sphere for air is not the
same as for water.

The real problem had to do with an air burst's effect
on underground structures. The heterogeneity of the
soil meant that several media were being encountered.
This adds considerably to the complexity of the
problem. It would appear that concentrating on the
radii and confidence intervals for radii in various
media would simplify the problem. If the bursts were
directional, then other regular figures such as an
ellipsoid would need to be considered. The shape of
the burst would determine which measurement should be
used. Hence, more emphasis on the shape of bursts in
various media should be made. The statistical
problem is usually simplified when the model
structure is completely specified. Then, for a given
number of media (e.g., sand, clay, loam, rocks, etc.)
in a given proportion, confidence intervals could be
constructed.


SOME APPLICATIONS OF BAYESIAN IMAGE ANALYSIS


Stuart Geman¹

Division of Applied Mathematics
Brown University

Providence, Rhode Island 02912 USA


The various tasks of image processing, such as removing blur, finding boundaries, and
detecting objects, have traditionally been approached on a case-by-case basis. The result
is a spectrum of ad hoc techniques. The author and his colleagues are trying to develop a
coherent mathematical foundation that will support a variety of these tasks, ranging from
problems in "low level vision", such as noise removal, to problems in "high level vision", such
as scene segmentation and analysis. The framework is Bayesian: probabilistic image models
are constructed. These are probability distributions jointly on picture element grey-levels,
locations of edge elements, placements and types of textures, and other image attributes as
may be appropriate in a particular application. Markov random fields (equivalently, Gibbs
distributions) are especially apt and convenient for representing real-world prior knowledge
about these attributes. The end product of the formulation is a posterior distribution on the
uncorrupted grey-levels, locations of edges, texture labels, and so on, given an observed and
possibly degraded picture. Image restoration and analysis amount to the identification of
the mode (or sometimes the mean) of this posterior distribution.

The  approach  is  implemented  in  four  steps.  Each  step  will  be  discussed  in  detail, 
highlighting  the  important  theoretical  issues.  These  steps  are: 

1. Construction of a prior distribution. The result is a probability distribution,
π(x), where the components of x represent picture element grey levels, locations and
orientations of edges, types and locations of textures, labels and locations of objects, and other
image attributes relevant to the image processing task. The dimensionality is very high, on
the order of 10^5 or 10^6. This prior distribution is a Markov random field, and is constructed
to be consistent with prior information about such things as the spatial smoothness of the
image intensity levels, the tendency of textures to appear in homogeneous patches, and
so on. This construction is greatly facilitated by the equivalence between Markov random
fields and Gibbs distributions; the Gibbs representation is well-suited for accommodating
the various types of prior knowledge in a consistent manner.

2. Modelling of the Degradation Mechanism. The observation, y, is some degradation
of the ideal image, x. The degradation may, for example, involve an attenuated Radon
transform, as in tomography, or a blur and noise process, as in satellite or infrared imaging.
Or, it may simply be a projection, as in the problem of boundary finding or object
identification: we model the degradation as "hiding" the boundary locations or the object labels.
Modelling the degradation amounts to specifying the conditional distribution, π(y|x), on
the observable process, y, given the ideal (and unknown) image x.

3. Identification of the Posterior Distribution. This is simply a matter of applying
Bayes' rule to π(x) and π(y|x) to derive π(x|y), the posterior distribution on the ideal image
given the observable process y.

4. Identification of the Mode or Mean of the Posterior Distribution. This
corresponds to image restoration and analysis. If, for example, x involves such "high-level"
attributes as texture and object labels, then identifying the mode of π(x|y) corresponds to
choosing the most likely interpretation, in the sense of texture and object identification,
given the observed process y. The posterior mean is computed by a highly parallel
algorithm called stochastic relaxation. This is a Monte Carlo technique that yields an ergodic
Markov process, x(t), with equilibrium distribution π(x|y). The mode can be found by a
variation called simulated annealing, which can be shown to converge (weakly) to a global
maximum of π(x|y).

The utility of the approach has been demonstrated by the results of experiments with
real scenes. These illustrate: (1) boundary detection; (2) texture segmentation and labeling;
and (3) single photon emission tomography. Details of these experiments, together with
theoretical results on parameter estimation for the prior, and on convergence of stochastic
relaxation and simulated annealing, can be found in the following references. These contain,
as well, discussions of the many contributions made by other authors to the Markov
random field/Bayesian framework for image analysis.

¹ Research partially supported by Army Research Office contract DAAG29-83-K-0116,
National Science Foundation grant DMS-8352087, and the General Motors Corporation.
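
The four steps can be illustrated on a toy problem. The sketch below is only an illustration under stated assumptions, not the author's implementation: the image is binary (pixels are -1 or +1), the prior is a nearest-neighbour Ising-type Gibbs distribution with coupling `beta`, the degradation is independent pixel flipping summarized by a data weight `lam`, and the logarithmic cooling schedule is one common choice. The names `anneal_restore`, `beta`, `lam`, `sweeps` and `t0` are hypothetical. Holding the temperature at 1 instead of cooling would give plain stochastic relaxation, that is, sampling from the posterior.

```python
import math
import random


def anneal_restore(y, beta=1.0, lam=1.5, sweeps=40, t0=3.0, seed=0):
    """Approximate the posterior mode of a binary image by simulated annealing.

    Assumed posterior energy: U(x) = -beta * sum_{neighbours} x_s x_t
                                     - lam * sum_s x_s y_s,
    so that the posterior is proportional to exp(-U(x)).
    """
    rng = random.Random(seed)
    rows, cols = len(y), len(y[0])
    x = [row[:] for row in y]                      # start from the observation
    for sweep in range(sweeps):
        temp = t0 / math.log(2.0 + sweep)          # slowly decreasing temperature
        for i in range(rows):
            for j in range(cols):
                # Sum of the four nearest neighbours that lie inside the image.
                nb = 0
                for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if 0 <= a < rows and 0 <= b < cols:
                        nb += x[a][b]
                field = beta * nb + lam * y[i][j]
                # Local conditional probability that pixel (i, j) equals +1
                # at the current temperature (one Gibbs-sampler update).
                p_plus = 1.0 / (1.0 + math.exp(-2.0 * field / temp))
                x[i][j] = 1 if rng.random() < p_plus else -1
    return x


if __name__ == "__main__":
    # Two-region test image with two corrupted pixels.
    noisy = [[-1] * 4 + [1] * 4 for _ in range(8)]
    noisy[2][1] = 1
    noisy[5][6] = -1
    for row in anneal_restore(noisy, seed=1):
        print("".join("+" if v == 1 else "-" for v in row))
```

Each pixel update uses only the pixel's neighbours and its own observation, which is what makes the method highly parallel in principle.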
An Algorithm For Diagnosis of System Failure*


Robert L. Launer
U. S. Army Research Office


1.  Background. 

In this note, the optimal diagnosis of system failure is considered.
Suppose that there is a system of n components C_1, C_2, ..., C_n, and that
this system becomes inoperable or fails when any one of the components fails.
In order that this problem be well posed, the term "component" may also
represent a subsystem of units operating in parallel so that subsystem
failure occurs when all of the components in that subsystem fail.

The problem considered here is that of finding the failed component or
components in the least possible time, or cost, when a system failure occurs.
The testing will be conducted one component at a time, initially. The more
general case will be considered later. Since the testing sequence will be
based on probabilistic information, the component reliabilities (or
equivalently the failure rates) and the average time (or cost) to test each of the
components are assumed to be known.

Wong [3] considered this problem of finding a (single) malfunctioning
component "such that the expected test time is optimal in the sense of
Bellman's principle of Optimality." The main result of that paper is that
"the minimum number of test points required for conclusive detection of
system failure is equal to the total number of terminal test points; this set
of points constitutes the optimal choice." No algorithm for sequencing the
components for achieving optimality is presented in that paper. It is
pointed out, however, that the "optimal strategy .. proceeds with the most
unreliable and the least test time component .. as the first component to
be tested; next in the sequence .. is the next most unreliable and costly
component. .. between the last two components, an optimal strategy always
chooses the one having a smaller test time regardless of their reliability
data."

In the present paper, a precise sequencing algorithm is developed and
presented. The problem is also generalized by considering multiple failures,
subsystem testing, and the idea of allowing a time for testing a component
that has failed which differs from the time to test when it has not failed.
The overall goal is to develop sequences which minimize the expected value
of the testing time or cost for the several testing situations considered.

*The author of this paper presented it at the 31st Conference on the Design
of Experiments.
2. One Component at a Time Testing.

Let R_i(t) be the reliability function of the i-th component at standard
use conditions. Let T_i and T_i' represent the time to test the i-th component
when it is operable or failed, respectively. It will be assumed that the
components fail independently of one another. If the reliability functions
are continuous, then the probability of more than one failure occurring at
time t is zero. This, of course, excludes catastrophic failures from
externally imposed destructive forces or other common-cause failures.
Nevertheless, multiple failures will also be discussed.

The probabilities of component failure given system failure are obtained
as follows. The probability of component i surviving until time t is R_i(t).
This may be computed explicitly from R_i(t) = \exp(-\int_0^t h_i(u) du). Let S
represent the event that the system does not survive beyond time t, and C_i the
corresponding event for component C_i. Then the following equality involving
conditional probabilities holds:

    P[S | C_i] P[C_i] = P[C_i | S] P[S]

From the previous assumption about component failures, the system fails when
any component fails so that P[S | C_i] = 1, and

    P[C_i | S] = P[C_i] / P[S]

This is the conditional probability of the failure of component i given system
failure (at time t). Let this probability be denoted by p_i. Then from the
previous assumptions it follows that

    p_i = (1 - R_i) \prod_{j \ne i} R_j / \sum_{k=1}^{n} (1 - R_k) \prod_{j \ne k} R_j    (1)


Suppose that the system has failed and that the components are tested
one at a time in the order 1, 2, 3, ... until the defective component is found,
at which time testing is terminated. The initial ordering of the components
is arbitrary. The expected test time, E, is then

    E = T_1' p_1 + \sum_{k=2}^{n} \{ [ \sum_{i=1}^{k-1} T_i + T_k' ] \prod_{i=1}^{k-1} (1 - p_i) p_k \}
        + ( \sum_{i=1}^{n} T_i ) ( \prod_{i=1}^{n} (1 - p_i) )    (2)

Let E' represent the expected test time when the order of the k-th and the
(k+1)-th components is interchanged and all others remain the same. The
difference E' - E is easily seen to be

    E' - E = p_k p_{k+1} \prod_{i=1}^{k-1} (1 - p_i)
             [ (T_{k+1}/p_{k+1}) - (T_k/p_k) - (T_{k+1} - T_{k+1}') + (T_k - T_k') ]    (3)

The expected testing time is decreased by this permutation if E' - E is negative.
Using a finite induction argument, the optimal ordering is then found by
computing the n quantities

    G_k = (T_k/p_k) - (T_k - T_k')    (4)

for each component and ordering the G_k beginning with the smallest and ending
with the largest. Examination of the first and last terms in E indicates that
the ordering scheme (4) also applies to these terms.

The optimal expected test time is obtained from (2) with the terms
arranged in the optimal order, but without including the last term since it
would be unnecessary to test the "last" component if the other n-1 were
tested and found to be operative.

Notice that if the terms T_k and T_k' are equal, then the G_k are easily
seen to correspond to the intuitive feeling that the components with shorter
testing time and higher failure probabilities should generally be tested first.
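
The ordering rule can be checked numerically on a small system. The sketch below is an illustration only: it assumes T_i is the test time of an operable component and T_i' (here `T_failed`) that of a failed one, evaluates (1), (2) and (4) as written above, and compares the G-sorted order with a brute-force search over all orderings. The function names and the sample numbers are hypothetical.

```python
from itertools import permutations
from math import prod


def failure_probs(R):
    """Equation (1): probability that component i is the failed one."""
    n = len(R)
    w = [(1 - R[i]) * prod(R[j] for j in range(n) if j != i) for i in range(n)]
    total = sum(w)
    return [wi / total for wi in w]


def expected_time(order, p, T, T_failed):
    """Equation (2) for the components tested in the given order."""
    e, elapsed, none_found = 0.0, 0.0, 1.0
    for k in order:
        # Component k is the first one found failed after the earlier tests.
        e += (elapsed + T_failed[k]) * none_found * p[k]
        elapsed += T[k]
        none_found *= 1 - p[k]
    return e + elapsed * none_found      # last term of (2)


def optimal_order(p, T, T_failed):
    """Equation (4): sort components by G_k, smallest first."""
    G = [T[k] / p[k] - (T[k] - T_failed[k]) for k in range(len(p))]
    return sorted(range(len(p)), key=G.__getitem__)


if __name__ == "__main__":
    R = [0.90, 0.80, 0.95, 0.70]         # component reliabilities (made up)
    T = [2.0, 5.0, 1.0, 4.0]             # test times when operable (made up)
    T_failed = [3.0, 4.0, 2.0, 6.0]      # test times when failed (made up)
    p = failure_probs(R)
    order = optimal_order(p, T, T_failed)
    best = min(expected_time(o, p, T, T_failed)
               for o in permutations(range(len(R))))
    print(order, expected_time(order, p, T, T_failed), best)
```

Because the last term of (2) does not depend on the ordering, including or omitting it changes the expected time but not the optimal sequence.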

3. Multiple Failures With One At A Time Testing

The case of multiple failures is considerably more complicated than the
single failure case. There is first of all the problem of determining the
multivariate failure law, which would yield the conditional failure
probabilities corresponding to (1) in the simpler case. The derivation of
this set of probabilities should be based on the physics of the particular
situation. In the absence of specific information one might use compound
probabilities.

Another complicating factor is how testing and repair are to be conducted.
If all of the failed components are to be identified before any repair begins,
then exhaustive testing would be implemented, in which case the testing
sequence is irrelevant. If, however, testing proceeds one component at a
time until a failed one is found, followed by immediate repair of that
component with further testing following the repair only if it is required,
then the testing sequence is important. The following development treats
the latter case.

Assume for the moment that the system in question is known to contain
exactly m < n failed components. The expected repair time for this case can
be written explicitly. It can be analyzed similarly to (2). The result has
been worked out for the cases m = 2 and m = 3 and may be described in the following
way. The testing order of the first m components does not affect the testing
time. The remaining n-m components should be tested in the order dictated by
(4). The general case was not worked out because of the inordinate amount of
algebra involved. The lower order cases indicate no surprises for the higher
order ones.

The point of this discussion is that if, unknown to the tester, the
system contains more than one failure, the procedure given by (4) will still
result in an optimal or near optimal sequence if continued testing is
indicated by system malfunction after the first failed component has been found
and repaired. Naturally, system "turn-on" after repair could induce a failure
among the previously tested components. Without appropriate data or
probabilistic information about this phenomenon, no definitive guidance can be given
about optimal or reasonable strategies to protect against it.


4.  Subsystem  Testing 

It seems reasonable to ask what further saving in testing time can be
realized by simultaneously testing components in groups if that is possible.
For example, if half of the components in a system could be tested together
in a reasonable period of time, followed by testing smaller subgroups or
single components when appropriate, it would appear that the expected testing
time could be further reduced, especially if only one component has failed.

Assume that the system in question yields a natural decomposition into
M subsystems or modules M_1, M_2, ..., M_M. Module k consists of n(k) components,
and its reliability is given by Q_k. The average time to test module k as a
single entity (that is, exclusive of any component testing) is U_k if it is
operational and U_k' if not, while the corresponding average times for component
j of the k-th module are T_j^k and T_j'^k. The probability that module k has failed
given system failure, P_k, is

    P_k = (1 - Q_k) \prod_{j \ne k} Q_j / \sum_{i=1}^{M} (1 - Q_i) \prod_{j \ne i} Q_j

The probability that component j has caused module k to fail is given by (1),
where the p_i, T_i and T_i' are restricted to the components of module k.

Corresponding to the previous testing set-up, it will be assumed that
testing proceeds one module at a time until the failed module is discovered.
Then the individual components are tested one at a time until the failed
component is found.

Let U_m represent the vector (U_1, U_2, ..., U_{m-1}, U_m'), T_j represent the vector
(T_1, T_2, ..., T_{j-1}, T_j') and 1_k the k-vector each component of which is a 1. Note
that the transpose of a matrix or vector will be denoted by a superscript T.
Further, let A_m represent the event that, given system failure, modules 1
through m-1 were found to be operational and module m was diagnosed as failed.
Let B_j represent the event that, given failure of module m, components 1, 2, ..., j-1
were found not to have failed and component j was diagnosed as failed.

Then for an arbitrary ordering of modules and components, the expected
testing time E is

    E = \sum_{m=1}^{M} \sum_{j=1}^{n(m)} (U_m^T 1_m + T_j^T 1_j) P(A_m) P(B_j)    (6)

If the m-th and (m+1)-th modules are interchanged, and the quantity E' - E
is computed as was done in section 2, the minimizing algorithm is
obtained. That is, the quantities H_j are obtained;

    H_j = E_j + (U_j/P_j) - (U_j - U_j')    (7)


where E_j represents the average time to complete the one at a time testing
of the components in module j. The algorithm indicates that optimization
of modular testing depends on the optimization of the component-wise testing
within each module, but both optimizations are obtained independently of
the other.

The optimizing algorithm is therefore to order the components in each
module according to (1) within the module, and then to compute the H_j for
each module, j = 1, 2, ..., M. The testing proceeds by diagnosing the module
corresponding to the smallest value of the H_j, followed by the module
corresponding to the next smallest value of H_j, and so on to the module
corresponding to the largest value last.
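
The module-level ordering can be sketched in the same way as the component-level one. The code below is a hedged illustration: it assumes that the module index has the form H_j = E_j + U_j/P_j - (U_j - U_j'), by analogy with the component index of section 2 and with a failed module incurring the extra component-testing time E_j, and it takes the within-module times E_j as given inputs rather than computing them. The names `module_order`, `U_failed` and the sample numbers are hypothetical.

```python
from math import prod


def module_order(Q, U, U_failed, E):
    """Return (testing order, P, H) for modules with reliabilities Q.

    Q[j]        reliability of module j
    U[j]        time to test module j when it is operational
    U_failed[j] time to test module j when it has failed
    E[j]        average time for one-at-a-time component testing in module j
    """
    M = len(Q)
    # Probability that module k is the failed one, given system failure.
    w = [(1 - Q[k]) * prod(Q[j] for j in range(M) if j != k) for k in range(M)]
    total = sum(w)
    P = [wk / total for wk in w]
    # Assumed form of the module index (see the lead-in above).
    H = [E[j] + U[j] / P[j] - (U[j] - U_failed[j]) for j in range(M)]
    order = sorted(range(M), key=H.__getitem__)   # smallest H first
    return order, P, H


if __name__ == "__main__":
    order, P, H = module_order(Q=[0.9, 0.6], U=[1.0, 1.0],
                               U_failed=[1.0, 1.0], E=[2.0, 2.0])
    print(order, P, H)
```

With equal test times in this example, the less reliable module has the smaller index and is diagnosed first, which matches the intuition stated for single components.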

It is useful to ask when component-wise testing within a given module
is more efficient than modular testing for that module. A good rule of thumb
is to use that method which requires the lesser overall average testing time.
This leads to the following algorithm. If the following inequality (8) holds


then  use  modular  testing. 


.  n(j) 

fS  UJ  " 


(8) 


It should be pointed out that there are several loose ends related to
this discussion which should be kept in mind. First, the given algorithms are
optimal in the sense of lowest expected testing time under certain restricted
conditions. The most general test situation would allow for an unrestricted
mix of any combination of single components and subsystems, whether these
are natural subsystems or not. There are 2^n - 1 such possible subsets to
consider in various combinations. This would require a prohibitively large
amount of computer time for even a moderately small system.

Another possible area of further exploration involves the computation of
the component reliability functions at every new failure. Unless the failure
rates are unusually well behaved, such as all constant failure rates, the
quantities p_i and G_i must be recomputed at each failure. With constant
failure rates, for example, the crossings of the reliability functions could be
computed once and for all, yielding a set of (n(n+1)/2) time zones of
consideration.

Finally, there is the question of data and prior or partial information.
The R_i, T_i and so forth, might not be known exactly. Moreover, if other prior
information is available, it certainly should be incorporated into the analysis.
INDIVIDUAL VERSUS GROUP SAMPLING*


Paul A. Roediger
Dover, New Jersey 07801


ABSTRACT: Lot acceptance based on INDIVIDUAL sampling has
been widely used during the past decade. Recently it was
recommended that this practice be discontinued and future
sampling be done on a GROUP basis. The need for specific
conversion guidance and procedures was thereby created. A model
assuming the family of negative log gamma distributions on
incoming INDIVIDUAL quality rates has been developed for the
purpose of selecting the GROUP plan most comparable to a given
INDIVIDUAL plan. In addition to the model details, examples are
presented and a previously published alternative is discussed.

1.0 INTRODUCTION

In lot-by-lot attributes sampling inspection, product is
divided into inspection lots and random samples are drawn from
each. We assume there are m quality characteristics, each having a
well-defined attribute requirement, i.e., a requirement which is
either met or is not. A unit not in conformance with the j-th
requirement is called a j-type defective. A nonconforming unit
with respect to one or more requirements is called a defective.
Two sampling modes, one based on defectives, called GROUP
sampling, and the other based on the m defective types, called
INDIVIDUAL sampling, are described. Both are permitted in [1],
MIL-STD-105D.


*The authors of this paper presented it at the 31st Conference on the Design
of Experiments.
Let d be the number of defectives obtained in a sample of N
units. We say that sampling is done in the GROUP mode when the
decision rule to accept or reject the lot is based only on d,
without further regard to defective types therein. In practice
GROUP sampling is implemented by the following


RULE G: ACCEPT LOT IF d ≤ C, OTHERWISE REJECT.


The numbers N and C are called the "sample size" and
"acceptance number" of the GROUP plan. Such plans are denoted by
(N,C). Note, the GROUP criterion ignores underlying defective
types entirely. For now, "reject" stands for any course of action
taken on lots not accepted.

Let p be the true lot fraction defective and q = 1 - p. Then,
the GROUP probability of acceptance (PA_G), in binomial form, of
lots of quality q is

(1.1)  PA_G(q) = OC(q;N,C) = \sum_{i=0}^{C} \binom{N}{i} q^{N-i} (1-q)^i

OC, for convenience treated as a function of q instead of p, is
called the Operating Characteristic (OC) curve of (N,C).

The second type of sampling, used in many current U.S. Army
commodity specifications, is INDIVIDUAL sampling. Let d_j be the
number of j-type defectives found in a sample of n units. In this
mode, lot acceptance is based only on the d_j's and is typically
invoked via the following

RULE I: ACCEPT LOT IF EACH d_j ≤ c, j = 1, 2, ..., m,
OTHERWISE REJECT.

The numbers n and c are called the "sample size" and
"acceptance number" of the INDIVIDUAL plan, which is denoted by
(n,c)^m. Let p_j be the true j-type defective rate and q_j = 1 - p_j. The
INDIVIDUAL probability of acceptance (PA_I) of lots with quality
profile Q = (q_1, q_2, ..., q_m) is

(1.2)  PA_I(Q) = \prod_{j=1}^{m} OC(q_j;n,c).

Note, PA_I is not a function of the one parameter q, as is
PA_G, but is instead a product of GROUP-like OC curve terms.

For a given profile Q, the overall lot quality, assuming
independence among the m defective types, is given by

(1.3)  q = \prod_{j=1}^{m} q_j

This equation establishes the basic connection between the
two sampling approaches, relating q, the GROUP quality of (1.1),
with the q_j's, the INDIVIDUAL qualities of (1.2).
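
Relations (1.1) through (1.3) are easy to evaluate directly. The following sketch is illustrative only; the function names `oc`, `pa_individual` and `group_quality` and the sample profile are hypothetical, and the binomial form of the OC curve is the standard one assumed in (1.1).

```python
from math import comb, prod


def oc(q, n, c):
    """Equation (1.1): probability of at most c defectives in n units
    when each unit is effective with probability q."""
    p = 1.0 - q
    return sum(comb(n, i) * p**i * q**(n - i) for i in range(c + 1))


def pa_individual(Q, n, c):
    """Equation (1.2): product of one OC term per defective type."""
    return prod(oc(qj, n, c) for qj in Q)


def group_quality(Q):
    """Equation (1.3): overall lot quality for independent defective types."""
    return prod(Q)


if __name__ == "__main__":
    Q = [0.99, 0.98, 0.97]               # made-up quality profile, m = 3
    q = group_quality(Q)
    # With c = 0 both modes accept with probability q**n.
    print(pa_individual(Q, 50, 0), oc(q, 50, 0))
    # With c = 1 the INDIVIDUAL value depends on the whole profile Q.
    print(pa_individual(Q, 50, 1), oc(q, 50, 1))
```

For c = 0 each OC term is q_j^n, so the product collapses to q^n and the two printed values in the first line agree; for c > 0 they generally differ.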

The  following  conversion  problem  is  considered: 


PROBLEM "P": GIVEN THE (n,c)^m INDIVIDUAL PLAN,

FIND THE "BEST" (N,C) GROUP PLAN REPLACEMENT.


The inverse problem is apparently more complicated, but, in
principle, can be back-solved by iteratively solving a converging
sequence of problems of the type posed.

The two approaches share a curious history. GROUP sampling,
once the authorized method, was eventually replaced by the
INDIVIDUAL method. This development was an outgrowth of a
computer revolution that helped promote a component oriented
approach to system reliability. Subsequent years have seen more
than just a balancing of this trend; indeed, a steady return to a
more integrated "systems" approach has ensued. With it, interest
in GROUP sampling has grown, to the point that direction was
recently given in [4] to discontinue the use of INDIVIDUAL
sampling altogether. Unfortunately, a sound conversion rationale
does not exist. Several sets of tables prescribe GROUP acceptable


representing the other extreme where all incoming defective rates
are equal.

As q is allowed to vary on (0,1], the bounds of (2.1) produce
an envelope that contains all possible PA_I(Q|q) values. A sample
envelope resulting from the (125,1)^14 INDIVIDUAL plan is depicted
in Figure 1. The importance of the envelope is stated in the
following:

CONCLUSION: A CANDIDATE GROUP OC CURVE
MUST BE CONTAINED WITHIN THE ENVELOPE.
THEREFORE, L(q) ≤ OC(q;N,C) ≤ U(q).


The envelope collapses if and only if m = 1 or c = 0. In both
cases, INDIVIDUAL and GROUP sampling are identical, provided of
course that (N,C) = (n,c).

3.0 MODEL REQUIREMENTS

Where exactly within the envelope should the "best" GROUP OC
curve be located? To help guide us, a model is proposed that
relies on probability distributions used as weighting functions.
The model has two desirable properties: it is general, taking
into account important aspects of the problem, yet tractable,
allowing computations to be carried out and simulated in terms of
known statistical quantities.

The following are utilized as part of the model:

the Beta probability density function (pdf),

(3.1)  bet(x;a,b) = x^{a-1} (1-x)^{b-1} / B(a,b),

and the Negative Log Gamma pdf,

(3.2)  nlg(x;a,b) = a^b x^{a-1} (\ln(1/x))^{b-1} / \Gamma(b),

where \Gamma(z) = \int_0^\infty t^{z-1} e^{-t} dt, B(a,b) = \Gamma(a)\Gamma(b)/\Gamma(a+b),

and z > 0, a > 0, b > 0 and 0 ≤ x ≤ 1.

The cumulative distribution functions (cdf's) obtained by
integrating bet(t;a,b) and nlg(t;a,b) with respect to t, t ∈ [0,x],
are denoted by BET(x;a,b) and NLG(x;a,b), respectively.

The negative log gamma density closely resembles the more
familiar beta. In fact, (3.2) is obtained from (3.1) by replacing
(1-x) with ln(1/x), two nearly equal terms for x close to 1, and
adjusting the constant term to normalize the integral. For fixed
"a", this family of densities has the special feature of being
closed under multiplication, i.e., if X_i ~ nlg(x;a,b_i), i = 1, 2, then
the product X_1 X_2 ~ nlg(x;a,b_1+b_2). The family is also a rich one,
taking on a wide variety of shapes including the "U", "L" and
"J" shaped, uniform and uni-modal densities. Its name is derived
from the fact that Y is negative log gamma distributed if and
only if -ln Y is gamma distributed. References [5] and [6]
contain more details about this distribution.


The model specifies weighting pdf's on the q_j's:

(4.1)  f_j(q_j) = nlg(q_j;a,b_j), where a > 0 and b_j > 0, j = 1, 2, ..., m.

In order to randomly generate vectors Q given q, the
conditional cdf of an arbitrary q_k, given q and possibly some or
all of the other q_j's, must be determined. The desired cdf's,
developed in Appendix 2, are, for k = 1, 2, ..., m-1,

(4.2)  G_k(q_k | q, q_1, q_2, ..., q_{k-1}) = BET(T_k(q_k); S_{k+1}, b_k),

where T_k(q_k) = \ln(q_k/P_k) / \ln(1/P_k), P_k ≤ q_k ≤ 1,

S_j = \sum_{i=j}^{m} b_i, and P_1 = q, P_j = q / \prod_{i<j} q_i, 1 < j ≤ m.

Equation (4.2) provides the basis for the following
procedure: for k = 1, 2, ..., m-1, take

(4.3)  q_k = (P_k)^{1 - BET^{-1}(R_k; S_{k+1}, b_k)},

where the R_k's are random numbers generated from a uniform
distribution on [0,1]. Once the first (m-1) q_k's are generated,
q_m is simply P_m.

Implementation of procedure (4.3) allows us to study the
distribution of PA_I(Q|q) via Monte Carlo simulation methods.
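
Procedure (4.3) can be sketched with the Python standard library alone. One substitution is made and should be read as an assumption: instead of inverting the Beta cdf at a uniform random number R_k, the sketch draws the Beta(S_{k+1}, b_k) variate directly with `random.betavariate`, which yields the same distribution for the generated q_k. The function name `sample_profile` and the example parameters are hypothetical.

```python
import math
import random


def sample_profile(q, b, rng=random):
    """Draw a quality profile Q = (q_1, ..., q_m) whose product equals q.

    q   overall (GROUP) quality, 0 < q <= 1
    b   shape parameters b_1, ..., b_m of the weighting pdf's in (4.1)
    """
    profile = []
    remaining = q                            # plays the role of P_k
    for k in range(len(b) - 1):
        s_next = sum(b[k + 1:])              # S_{k+1}
        t = rng.betavariate(s_next, b[k])    # same law as BET^{-1}(R_k; S_{k+1}, b_k)
        profile.append(remaining ** (1.0 - t))   # equation (4.3)
        remaining = remaining ** t           # P_{k+1} = P_k / q_k
    profile.append(remaining)                # q_m is simply P_m
    return profile


if __name__ == "__main__":
    rng = random.Random(1)
    Q = sample_profile(0.90, [0.25] * 4, rng)
    print(Q, math.prod(Q))
```

Every generated vector satisfies (1.3) by construction, so the simulated PA_I values are all conditional on the same overall quality q.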

5.0 PARAMETER SELECTION

The pdf on q resulting from (1.3) and (4.1) is
(5.1)  f(q) = nlg(q;a,S_1).

An interpretation of "a" is found by taking the expected
value of (5.1), giving E(q) = a/(a+1), so that a = E(q)/[1-E(q)], the
odds of randomly picking an effective unit when quality is at its
average. Consequently, in most practical applications, "a" will
be quite large. Note, however, (4.2) and (4.3) are independent of
this parameter.
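
The stated mean can be checked directly from the definition (3.2). The derivation below is a sketch that keeps the shape parameter S_1 general; the closed form for general S_1 is not quoted from the text and is shown only to make explicit that the value a/(a+1) corresponds to S_1 = 1.

```latex
\begin{align*}
E(q) &= \int_0^1 q \,\frac{a^{S_1}\, q^{a-1}\, [\ln(1/q)]^{S_1-1}}{\Gamma(S_1)}\, dq \\
     &= \left(\frac{a}{a+1}\right)^{S_1}
        \int_0^1 \frac{(a+1)^{S_1}\, q^{(a+1)-1}\, [\ln(1/q)]^{S_1-1}}{\Gamma(S_1)}\, dq
      = \left(\frac{a}{a+1}\right)^{S_1}, \\
S_1 = 1:\quad E(q) &= \frac{a}{a+1}
      \;\Longrightarrow\; a = \frac{E(q)}{1-E(q)} .
\end{align*}
```

The remaining integral equals one because its integrand is the density nlg(q; a+1, S_1).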

Of particular interest to us are the "J" shaped pdf's that
result when a > 1 and b = 1. Then, (5.1) defines a one-parameter
family which is deemed sufficiently rich for the purpose of
assigning appropriate weights to q. Having no prior information,
the b_j's are assumed to be equal. Since they sum to b = 1, b_j = 1/m
for j = 1, 2, ..., m, yielding J-shaped weights on the q_j's which are
asymptotic at q_j = 1.


A three step approach to solving problem "P", fully
computerized and documented in [8], will now be described.
STEP 1: Monte Carlo simulation-

(a) Choose K, the number of distinct q's at which
to simulate PA_I(Q|q). We take K=19.

(b) Select an appropriate q-interval [u_1,u_K]. We utilize
the criteria U(u_1)<.10 and L(u_K)>.95, ensuring
that PA_I(Q|q=u_1)<.10 and PA_I(Q|q=u_K)>.95.

(c) Define the equi-spaced intermediate points
u_{i+1} = u_i + (u_K-u_1)/(K-1), for i=1,2,...,K-2.

(d) Generate Isum random vectors Q|q=u_1, per (4.3).
Our simulations utilize Isum=1000 repetitions.

(e) Obtain the empirical density of PA_I(Q|q=u_1).

(f) Compute the 50th percentile, and call it y_1 (y_i if q=u_i).
Other percentiles, namely .0, .1, .2, .3, .4, .6, .7, .8, .9
and 1.0, are computed and processed as are the medians.
However, we do not consider the resulting GROUP plans
to be as useful simply because the risks to producer
and consumer are unbalanced.

(g) Repeat (d) thru (f) using u_2, u_3, ..., u_K instead of u_1.

(h) Obtain {(u_i,y_i)}, i=1,2,...,K.

(i) If the y_i's are increasing, proceed directly to step 2.
If not, and this case has never occurred, either
increase the number of trials Isum, decrease the
number of points K, or open up the interval [u_1,u_K].
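
Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the program documented in [8]: the function names are hypothetical, OC(q;n,c) is evaluated in its binomial form (equivalent to BET(q;n-c,c+1)), the weights are the equal b_j = 1/m of Section 5.0, PA_I is taken as the product of the individual OC values, and the grid endpoints .9525 and .9975 are those implied by the P_1 example of Section 7.0.

```python
import math
import random
import statistics

def oc(q, n, c):
    """OC(q; n, c): chance of at most c ineffective units among n."""
    p = 1.0 - q
    return sum(math.comb(n, j) * p ** j * q ** (n - j) for j in range(c + 1))

def sample_profile(q, m, rng):
    """One draw of Q given q via (4.3), with equal weights b_j = 1/m."""
    b = 1.0 / m
    profile, p = [], q
    for k in range(m - 1):
        s_next = (m - 1 - k) * b           # S_{k+1}: sum of the remaining b_j
        t = rng.betavariate(s_next, b)
        qk = min(1.0, p ** (1.0 - t))
        profile.append(qk)
        p = p / qk
    profile.append(min(1.0, p))
    return profile

def median_pa(q, n, c, m, isum, rng):
    """Steps (d)-(f): simulate PA_I(Q|q) isum times and take the median."""
    draws = [math.prod(oc(qj, n, c) for qj in sample_profile(q, m, rng))
             for _ in range(isum)]
    return statistics.median(draws)

def simulate_medians(us, n, c, m, isum=1000, seed=0):
    """Steps (g)-(h): one median per grid point."""
    rng = random.Random(seed)
    return [median_pa(u, n, c, m, isum, rng) for u in us]

K = 19
u_first, u_last = 0.9525, 0.9975
grid = [u_first + i * (u_last - u_first) / (K - 1) for i in range(K)]
medians = simulate_medians(grid, n=125, c=1, m=14)
for u, y in zip(grid, medians):
    print(f"{u:.4f}  {y:.5f}")
```

The printed medians depend on the random seed, so they will not reproduce any published table digit for digit; the monotonicity check of step (i) still has to be applied to whatever the simulation returns.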

STEP 2: Interpolation-

(a) Linearly connect the points (u_i,y_i), i=1,2,...,K.
Call this increasing piecewise linear function y=f(q).

(b) Use inverse interpolation to find six unique q values,
u_1 thru u_6, corresponding to y_1=f(u_1) thru y_6=f(u_6),
where y_1=.1, y_2=.3, y_3=.5, y_4=.7, y_5=.9 and y_6=.95.
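
The inverse interpolation of step 2 can be illustrated with the raw simulated medians reported for example P_1 in Table 1 of Section 7.0. The sketch below is an assumption-laden illustration: the function name is hypothetical, the first two grid points (.9525, .9550) are inferred from the equal spacing and K=19, and one digit of the median at q=.9925 is illegible in the source, so .85072 is assumed because it reproduces the tabulated interpolated value .9942.

```python
def inverse_interpolate(us, ys, target):
    """Return q with f(q) = target, f being the piecewise linear curve
    through the increasing points (us[i], ys[i])."""
    for i in range(len(us) - 1):
        if ys[i] <= target <= ys[i + 1]:
            frac = (target - ys[i]) / (ys[i + 1] - ys[i])
            return us[i] + frac * (us[i + 1] - us[i])
    raise ValueError("target lies outside the simulated range")

# Raw (u_i, y_i) pairs for P_1; 0.85072 is an assumed reading (see above).
raw_u = [0.9525 + 0.0025 * i for i in range(19)]
raw_y = [0.04728, 0.06110, 0.07410, 0.09319, 0.11415, 0.14505, 0.17526,
         0.20826, 0.25857, 0.30661, 0.36534, 0.43265, 0.50934, 0.59520,
         0.67824, 0.77007, 0.85072, 0.92500, 0.97897]

targets = [0.1, 0.3, 0.5, 0.7, 0.9, 0.95]
new_u = [inverse_interpolate(raw_u, raw_y, y) for y in targets]
print([round(u, 4) for u in new_u])
```

Rounded to four places, these six values agree with the starred (interpolated) q entries of Table 1.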

STEP 3: Find the "best" (N,C) approximation-

(a) Define a range [Cmin,Cmax] for C.
We take Cmin=max(0,c-5), Cmax=Cmin+10 and begin the
search with C=Cmin.

(b) Permissible (N,C) are required to satisfy
OC(u_1;N,C) < y_1+e_1 and
OC(u_6;N,C) > y_6-e_6
where e_1,e_6 are two small positive constants. The use
of perturbed values (e_1,e_6>0) helps ensure that the
"best" N,C combination is not eliminated at the start
of the search. In the terminology of Hald ([3], pp 25),
(N,C) is said to be "stronger" than a plan whose
OC curve passes thru the two points (u_1,y_1+e_1) and
(u_6,y_6-e_6). We utilize e_1=e_6=.2.

(c) Find the largest interval I(C) such that the conditions
of (b) are met for all N \in I(C). Approximate formulae
developed in [3], pp 51, are used to determine the exact
interval. If I(C) is empty, proceed directly to (e).

(d) Find the N that minimizes
Del(N,C) = \sum_{i=1}^{5} |OC(u_i;N,C) - y_i|
for N \in I(C). Call it N_C. Note, Del does not depend on u_6.

(e) Repeat (b) thru (d) for C=Cmin+1,...,Cmax.

(f) Obtain a final set of candidate plans
{(N_C,C) | C \in [Cmin,Cmax], I(C) \neq \emptyset}.

(g) Find the C that minimizes Del(N_C,C),
for C \in [Cmin,Cmax], I(C) \neq \emptyset. Call it C*.

(h) Obtain the "best" GROUP plan (N,C), namely (N_{C*},C*).
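
A brute-force version of step 3 is sketched below. It is not the authors' implementation: the approximate formulae of [3] used in step (c) are replaced by a direct scan over N, and the function names and the scan limit n_max are hypothetical. Del is taken as a sum of absolute deviations over the first five interpolated points, and the input points are the rounded interpolated values of the P_1 example (Table 1, Section 7.0).

```python
import math

def oc(q, n, c):
    """OC(q; n, c) in binomial form: at most c ineffective units among n."""
    p = 1.0 - q
    return sum(math.comb(n, j) * p ** j * q ** (n - j) for j in range(c + 1))

def delta(u, y, N, C):
    """Del(N, C): lack of fit over the first five interpolated points."""
    return sum(abs(oc(u[i], N, C) - y[i]) for i in range(5))

def best_group_plan(u, y, c, e1=0.2, e6=0.2, n_max=1000):
    """Steps 3(a)-(h) by exhaustive scan; n_max truncates the scan over N."""
    c_min = max(0, c - 5)
    c_max = c_min + 10
    candidates = []
    for C in range(c_min, c_max + 1):
        # Step (b)/(c): permissible N for this C (a contiguous interval,
        # since OC decreases in N for fixed q and C)
        interval = [N for N in range(C + 1, n_max + 1)
                    if oc(u[0], N, C) < y[0] + e1
                    and oc(u[5], N, C) > y[5] - e6]
        if not interval:
            continue
        # Step (d): best N for this C
        n_c = min(interval, key=lambda N: delta(u, y, N, C))
        candidates.append((delta(u, y, n_c, C), n_c, C))
    # Steps (g)-(h): smallest Del over all candidate plans
    score, n_best, c_best = min(candidates)
    return n_best, c_best, candidates

u_pts = [0.9608, 0.9747, 0.9822, 0.9881, 0.9942, 0.9962]
y_pts = [0.1, 0.3, 0.5, 0.7, 0.9, 0.95]
N, C, cands = best_group_plan(u_pts, y_pts, c=1)
print("best plan:", (N, C))
for score, n_c, c_val in cands:
    print(c_val, n_c, round(score, 5))
```

Because the u values here are rounded to four places, the selected N may differ by a unit from a search run on unrounded values.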


7.0 EXAMPLES AND DISCUSSION

The above three step procedure will be designated method B
(for "best"). The "best" (N,C) will be called the B-plan.

Several examples provide a setting for our discussion of
method B.

First, consider P_1: Given (n,c)^m = (125,1)^14,
find the "best" (N,C).

Obtain its B-plan: (N,C) = (94,1).

A partial summary of simulation data (steps 1h and 2b) along with
approximating B-plan OC curve values is presented in Table 1.


Table 1. B-plan for P_1

              Median        B-plan
     q       PA_I(Q|q)    OC(q;94,1)

   .9525      .04728       .058647
   .9550      .06110       .071625
   .9575      .07410       .087243
   .9600      .09319       .105966
 * .9608      .10000       .112802
   .9625      .11415       .128314
   .9650      .14505       .154860
   .9675      .17526       .186224
   .9700      .20826       .223054
   .9725      .25857       .265998
 * .9747      .30000       .308413
   .9750      .30661       .315661
   .9775      .36534       .372537
   .9800      .43265       .436909
 * .9822      .50000       .499582
   .9825      .50934       .508711
   .9850      .59520       .587320
   .9875      .67824       .671285
 * .9881      .70000       .691709
   .9900      .77007       .757932
   .9925      .85072       .842845
 * .9942      .90000       .894978
   .9950      .92500       .919145
 * .9962      .95000       .948909
   .9975      .97897       .976533

 * Interpolated values

Method B has been designed specifically to be a fair
conversion strategy, suitable to both producer and consumer. This
intention is particularly reflected in

Step 1f: Skewness in the simulated data convinced us that
the B-plan should approximate the set of median, not mean,
PA_I values. As such, the B-plan rejects more often than the
INDIVIDUAL plan, for half of the profiles Q considered in
the simulation, and accepts more often for the other half.
In this sense the producer's and consumer's risks associated
with the conversion are equalized.

Steps 1a thru 1c: A fair GROUP plan should provide close
approximation throughout the low, middle and high range of
median PA_I values. Our choice of q-interval and fine
discretization of it into K=19 equi-spaced points ensures
that the simulated median PA_I's will cover the entire
spectrum of interest.

Step 2: A fair GROUP plan should also give equal
consideration to the low, middle and high range of median
PA_I values. Unfortunately, it is not possible to choose the
u_i's in advance so as to get a balanced set of median PA_I
values. For example, the raw (un-interpolated) data of
Table 1, with almost half (9/19) of its simulated medians
below 0.25, is considerably biased toward low PA_I values.
Were a GROUP plan fitted to the raw data, a (95,1) B-plan
would result, producing a fit that is slightly better than
(94,1) in the low PA_I range, but worse elsewhere. For this
reason, the B-plan has been based on u_i's corresponding to
the more balanced set {.1,.3,.5,.7,.9} of interpolated
median PA_I values. The insensitivity as to which data base
is used, raw or interpolated, is typical and reassuring.

Step 3d: Del, the sum to be minimized, attaches equal
weight to the approximation's lack of fit at each
interpolated data point.


The set of candidate plans, including the (94,1) B-plan,
along with their scores Del (step 3d), are presented in Table 2.


Table 2. Candidate B-plans for P_1

(The N_C and C columns are illegible in the source; the Del
column reads, in order:)

   Del:  .35901  .03495  .14799  .26565  .34854
         .41117  .46023  .50000  .53313  .56060


The N_C and C of Table 2 are highly correlated, with r=.999996
and regression N_C = 56C+38.2. In general, candidate B-plan OC
curves are naturally forced to pivot about (u_3,.5), the fixed
"indifference point" (IP) determined in step 2b. How closely the
OC curves approximate the IP depends on the other u_i's,
especially when C is small, but their effect diminishes rapidly
as C increases. Based on the IP only, Hald shows in [3], pp 195,
that N_C = aC+b, where a=1/(1-u_3) and b=(1+u_3)/(3-3u_3). By taking
u_3=.9822, a=56.18, b=37.12 and rounding to the nearest integer,
the N_C's of Table 2 are duplicated, except when C=0, 1 and 2,
where you get 37, 93 and 149 respectively. This correlation can
be exploited to economize the search for candidate plans, but,
depending only on u_3, does not constitute per se a reliable
shortcut approach.
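
The indifference-point relation quoted above is easy to check numerically. The short sketch below (the function name is hypothetical) evaluates a, b and the rounded N_C for small C at u_3=.9822.

```python
def ip_sample_size(C, u3):
    """N_C = a*C + b from the indifference point u3 alone, with
    a = 1/(1-u3) and b = (1+u3)/(3-3*u3) as stated in the text."""
    a = 1.0 / (1.0 - u3)
    b = (1.0 + u3) / (3.0 - 3.0 * u3)
    return a, b, a * C + b

a, b, _ = ip_sample_size(0, 0.9822)
print(round(a, 2), round(b, 2))
print([round(ip_sample_size(C, 0.9822)[2]) for C in range(3)])
```

For C=1 this shortcut gives 93 rather than the B-plan's 94, which illustrates why the text treats the relation as a way to narrow the search rather than replace it.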


Before other examples are presented, an alternative
conversion method forming the basis of [2] is described. It
consists of taking N=n and letting C be the smallest x satisfying
OC(AQL^m;N,x) >= OC(AQL;n,c), where n, c and m are specified and AQL
is defined by OC(AQL;n,c) = .95 (or .90). The intent here is to
accept, with high probability, incoming product whose quality
characteristics are all at AQL. From a consumer point of view,
such an approach is intuitively unacceptable. The version
proposed in [2], designated here as method A (for "alternative"),
limits (n,c) and (N,C) to be MIL-STD-105D plans, and utilizes
tabulated AQL values instead of exact ones. (N,C) determined in
accordance with method A will be called an A-plan.
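
The underlying rule (before the MIL-STD-105D tabular restrictions of [2] are imposed) can be sketched as follows. This is an illustration of the exact-AQL variant only, so its output need not match the tabulated A-plans; the function names and the bisection tolerance are assumptions.

```python
import math

def oc(q, n, c):
    """OC(q; n, c) in binomial form: at most c ineffective units among n."""
    p = 1.0 - q
    return sum(math.comb(n, j) * p ** j * q ** (n - j) for j in range(c + 1))

def exact_aql(n, c, level=0.95, tol=1e-12):
    """Solve OC(AQL; n, c) = level by bisection (OC increases with q)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if oc(mid, n, c) < level:
            lo = mid
        else:
            hi = mid
    return hi

def method_a_exact(n, c, m, level=0.95):
    """N = n; C = smallest x with OC(AQL^m; n, x) >= OC(AQL; n, c)."""
    aql = exact_aql(n, c, level)
    target = oc(aql, n, c)
    q_group = aql ** m          # all m characteristics at AQL
    for x in range(n + 1):
        if oc(q_group, n, x) >= target:
            return n, x, aql
    return n, n, aql

N_a, C_a, aql = method_a_exact(125, 1, 14)
print(N_a, C_a, round(aql, 5))
```

With m=1 the rule returns the original plan unchanged, while for m>1 it can only raise the acceptance number, never lower the sample size.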

Table 3 presents sample conversions obtained by the two
methods, for seven INDIVIDUAL plans having nominal AQL's of .996
(.4% AQL in [1]), at two values m=3 and m=14.

Table 3. Comparison of A,B-plans

                     m=3                    m=14
              A-plan     B-plan      A-plan      B-plan
    n-c        N-C        N-C         N-C         N-C

    32-0       32-2       32-0        32-5        32-0
   125-1      125-5      104-1       125-14 *     94-1 *
   200-2      200-5      160-2       200-21      140-2
   315-3      315-7      247-3       200-21      213-3
   500-5      500-10     389-5       200-21      329-5
   800-7      800-21     624-7       200-21      519-7
  1250-10    1250-21     973-10      200-21      804-10

* A,B-plans for P_1, also depicted in Figure 1

Method B results imply that producer and consumer risks are
more naturally balanced by taking C=c and N<n, than taking N=n
and C>c, as suggested in [2]. Also, method B conversions produce
average sample reductions (c>0) of 20% and 30% in the m=3 and
m=14 cases respectively, as compared to no anticipated method A
reductions, except those incidental to the tabular limitations of
[1], viz., the repeating (200,21) A-plan. Note also that method A
misses the only sensible conversion when c=0, namely (N,C)=(n,c).
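
The quoted average reductions can be checked directly from the B-plan sample sizes of Table 3 for the six plans with c>0. The short calculation below is only a verification aid with hypothetical variable names; it gives roughly 0.21 for m=3 and 0.32 for m=14, in line with the rounded 20% and 30%.

```python
# INDIVIDUAL sample sizes n (plans with c > 0) and B-plan sizes N from Table 3
individual_n = [125, 200, 315, 500, 800, 1250]
b_plan_n = {
    3: [104, 160, 247, 389, 624, 973],
    14: [94, 140, 213, 329, 519, 804],
}

def average_reduction(n_list, big_n_list):
    """Mean of the relative reductions 1 - N/n."""
    return sum(1.0 - N / n for n, N in zip(n_list, big_n_list)) / len(n_list)

for m, sizes in b_plan_n.items():
    print(m, round(average_reduction(individual_n, sizes), 3))
```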

Figure 1 compares the A and B-plan OC curves of Table 3 (*),
showing them in relation to the envelope [L(q),U(q)] defined by
(2.1).


[Figure 1 plot: OC curves for m=14, with legend (n,c) = (125,1),
ALT(A): (125,14), BEST(B): (94,1); horizontal axis: % Effective (q).]

Figure 1. A,B-plans for P_1, with envelope [L,U]


The A-plan for P_1 is highly inconsistent with the original
INDIVIDUAL plan; as such, method A is not a viable alternative.

Finally, Table 4 shows how B-plans change with m, for two
different (n,c) INDIVIDUAL plans.


Table 4. B-plans, varying m

             m=2     m=3     m=5     m=7     m=10    m=15    m=20
    n-c      N-C     N-C     N-C     N-C     N-C     N-C     N-C

    50-1     44-1    42-1    40-1    39-1    38-1    38-1    37-1
   315-3    271-3   246-3   227-3   222-3   218-3   213-3   210-3

As m increases, INDIVIDUAL sampling loosens and becomes less
discriminating; with C=c, GROUP sampling can mimic this behavior
by decreasing N from n to c as m goes from 1 to infinity.

8.0  CONCLUDING  REMARKS 

A general model has been described that allows one to
simulate important performance measures of INDIVIDUAL sampling
for a fixed q, e.g., PA_I. We have utilized simulated median PA_I's
as target values for PA_G, thereby determining the B-plan
conversion.

Not to be overlooked are the important considerations of how
one responds to rejected lots and its impact on average outgoing
quality (AOQ). The direct application of the B-plan without
regard to alternative screening rules may substantially affect
AOQ, and consequently average fraction inspected (AFI), as
compared to the INDIVIDUAL plan. Future work will include an
analysis of the conversion problem from the standpoint of AOQ,
aimed at picking the "best" GROUP screening rule, given (n,c)^m.


APPENDIX 1


Proof of (2.1): to handle the constraints 0 < q_j <= 1, set q_j = e^{-x_j^2}.
By taking logs of objective function (1.2) and constraint (1.3),
the problem takes the form

    extremize L(X) = \sum_{j=1}^{m} ln{OC(e^{-x_j^2};n,c)},

    subject to G(X) = \sum_{j=1}^{m} x_j^2 = -ln(q).

According to the Lagrange multiplier theory, extrema occur when
all partials of F(X,\lambda) = L(X) + \lambda(G(X) + ln(q)) are zero.

The computations rely on

    d/dq [OC(q;n,c)] = q^{n-c-1} (1-q)^c / B(n-c,c+1),

which is just a restatement of OC(q;n,c) = BET(q;n-c,c+1), and

    OC(q;n,c) = q^{n-c} (1-q)^c H(q),

    where H(q) = \sum_{j=0}^{c} \binom{n}{c-j} [q/(1-q)]^j.

Therefore, for 1 <= i <= m,

    \partial F(X,\lambda)/\partial x_i = 2 x_i {\lambda - 1/[B(n-c,c+1) H(q_i)]},

implying that extrema occur at vectors Q whose components either
(a) equal 1, or (b) satisfy, for q_i and q_j not 1, H(q_i) = H(q_j).
Since H(q) is monotone increasing, H(q_i) = H(q_j) implies q_i = q_j,
so to satisfy the constraint
\partial F(X,\lambda)/\partial\lambda = G(X) + ln(q) = 0,
it follows that optimal Q have (m-k) components equal to 1,
and k components equal to q^{1/k}, for k=1,2,...,m. To determine
which values of k correspond to the max and min, consider
g(t) = [OC(q^{1/t};n,c)]^t. For integer t >= 1, g(t) is the objective
function (1.2) evaluated at optimal Q vectors. Since g(t) is
decidedly monotone increasing (see [7]), the min and max occur at
k=1 and k=m respectively, producing (2.1). Q.E.D.
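
The closing argument of the proof can be illustrated numerically. The sketch below (hypothetical function names, OC written in its binomial form) evaluates g(k) = [OC(q^{1/k};n,c)]^k for k=1,...,m and reports the two ends of the range, which are the envelope values L(q) and U(q) according to the proof above. The plan (125,1) with m=14 and the value q=.9825 are chosen only as an example.

```python
import math

def oc(q, n, c):
    """OC(q; n, c) in binomial form: at most c ineffective units among n."""
    p = 1.0 - q
    return sum(math.comb(n, j) * p ** j * q ** (n - j) for j in range(c + 1))

def g(t, q, n, c):
    """Objective (1.2) when t components equal q**(1/t) and the rest equal 1."""
    return oc(q ** (1.0 / t), n, c) ** t

def envelope(q, n, c, m):
    """Lower and upper bounds on PA_I(Q|q): g at k = 1 and at k = m."""
    return g(1, q, n, c), g(m, q, n, c)

q, n, c, m = 0.9825, 125, 1, 14
values = [g(k, q, n, c) for k in range(1, m + 1)]
print([round(v, 4) for v in values])
print("L, U =", envelope(q, n, c, m))
```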


APPENDIX 2

Proof of (4.2): make the change of variable

    q_m = P_m = q / \prod_{i=1}^{m-1} q_i.

The joint density

    f(q,q_1,q_2,...,q_{m-1}) = f_m(P_m) \prod_{i=1}^{m-1} (f_i(q_i)/q_i)

                             = nlg(P_m;a,b_m) \prod_{i=1}^{m-1} (nlg(q_i;a,b_i)/q_i).

The conditional joint density is

    h(q_1,q_2,...,q_{m-1} | q) = f(q,q_1,q_2,...,q_{m-1}) / f(q) = F_1 F_2 / F_3,

where

    F_1 = \prod_{i=1}^{m-1} {[ln(1/q_i)]^{b_i - 1} / q_i},

    F_2 = [ln(1/P_m)]^{S_m - 1} / [ln(1/P_1)]^{S_1 - 1}

        = \prod_{i=1}^{m-1} {[ln(1/P_{i+1})]^{S_{i+1} - 1} / [ln(1/P_i)]^{S_i - 1}},

and

    F_3 = \prod_{i=1}^{m} \Gamma(b_i) / \Gamma(S_1) = \prod_{i=1}^{m-1} B(S_{i+1},b_i).

Therefore h = \prod_{i=1}^{m-1} bet(T_i(q_i);S_{i+1},b_i) dT_i(q_i)/dq_i.

Integration of h with respect to q_i over [P_i,q_i] if i=k, and
[P_i,1] if i \neq k, for i=1,2,...,m-1 and a specified k, then
changing variables back to q_k, produces (4.2). Q.E.D.

